Methods for modulating embryonic stem cell differentiation

ABSTRACT

Described herein is Zscan4, a gene exhibiting 2-cell embryonic stage and embryonic stem cell specific expression. Identification of nine Zscan4 co-expressed genes is also described. Inhibition of Zscan4 expression inhibits the 2-cell to 4-cell embryonic transition and prevents blastocyst implantation, expansion and outgrowth. Provided herein are methods of inhibiting differentiation of a stem cell, promoting blastocyst outgrowth of embryonic stem cells and identifying a subpopulation of stem cells expressing Zscan4. Further described is the identification of Trim43 as a gene exhibiting morula-specific expression. Also provided are isolated expression vectors comprising a Zscan4 promoter or a Trim43 promoter operably linked to a heterologous polypeptide, and uses thereof. Further provided are transgenic animals comprising transgenes encoding marker proteins operably linked to Zscan4 and Trim43 promoters.

CROSS REFERENCE TO RELATED APPLICATION

This application claims the benefit of priority of U.S. Provisional Application No. 60/920,215, filed Mar. 26, 2007, which is herein incorporated by reference in its entirety.

FIELD

This application relates to the field of cellular differentiation, specifically to the methods of identifying and using a subpopulation of stem cells, which can be identified by the expression of Zscan4 or one or more Zscan4 co-expressed genes described herein, and the methods of inhibiting differentiation and prolonging viability by altering Zscan4. This application also relates to the identification of Trim43 as a gene highly expressed at the morula stage.

BACKGROUND

Stem cells have been identified in several somatic tissues including the nervous system, bone marrow, epidermis, skeletal muscle, and liver. This ‘set-aside’ population of cells is believed to be responsible for maintaining homeostasis within individual tissues in adult animals. The number of stem cells and their decision to differentiate must be tightly controlled during embryonic development and in the adult animal to avoid premature aging or tumor formation. Different somatic stem cells share the properties of self-renewal and multi-developmental potential, suggesting the presence of common cellular machinery.

Embryonic stem (ES) cells can proliferate indefinitely in an undifferentiated state. Furthermore, ES cells are pluripotent cells, meaning that they can generate all of the cells present in the body (bone, muscle, brain cells, etc.). ES cells have been isolated from the inner cell mass of the developing murine blastocyst (Evans et al., Nature 292:154-156, 1981; Martin et al., Proc. Natl. Acad. Sci. U.S.A. 78:7634-7636, 1981; Robertson et al., Nature 323:445-448, 1986; Doetschman et al., Nature 330:576-578, 1987; and Thomas et al., Cell 51:503-512, 1987; U.S. Pat. No. 5,670,372). Additionally, human cells with ES cell properties have recently been isolated from the inner blastocyst cell mass (Thomson et al., Science 282:1145-1147, 1998) and developing germ cells (Shamblott et al., Proc. Natl. Acad. Sci. U.S.A. 95:13726-13731, 1998) (see also U.S. Pat. No. 6,090,622, PCT Publication Nos. WO 00/70021 and WO 00/27995).

There is growing interest in the analysis of patterns of gene expression in cells, such as stem cells. However, few studies have identified an individual gene product that functions in the complex network of signals in developing tissues to inhibit differentiation and increase proliferation.

SUMMARY

Described herein is the identification of Zscan4 as a gene specifically expressed during the 2-cell embryonic stage and in embryonic stem cells. Further described herein is the identification of Zscan4 co-expressed genes which exhibit a similar expression pattern as Zscan4 in the developing embryo. Also described herein is the identification of Trim43 as a gene abundantly expressed at the morula stage of embryonic development.

Provided herein are methods of inhibiting differentiation of a stem cell comprising increasing the expression of Zscan4 in the stem cell. In one embodiment, inhibiting differentiation of the stem cell increases viability of the stem cell. In another embodiment, inhibiting differentiation of the stem cell prevents senescence of the stem cell. As described herein, the stem cell can be any type of stem cell, including, but not limited to, an embryonic stem cell, an embryonic germ cell, a germline stem cell or a multipotent adult progenitor cell.

Also provided herein is a method of promoting blastocyst outgrowth of an embryonic stem cell, comprising increasing the expression of Zscan4 in the embryonic stem cell, thereby promoting blastocyst outgrowth of the embryonic stem cell.

Further provided is a method of identifying an undifferentiated subpopulation of stem cells expressing Zscan4, comprising transfecting stem cells with an expression vector comprising a Zscan4 promoter and a reporter gene, wherein expression of the reporter gene indicates Zscan4 is expressed in the subpopulation of stem cells. In one embodiment, the promoter is a Zscan4c promoter.

An isolated expression vector comprising a Zscan4 promoter operably linked to a heterologous polypeptide is also provided. In one embodiment, the Zscan4 promoter is a Zscan4c promoter. In another embodiment, the heterologous polypeptide is a marker, enzyme or fluorescent protein. Also provided is an expression vector comprising a Trim43 promoter operably linked to a heterologous polypeptide. In some embodiments, the Trim43 promoter comprises at least a portion of the nucleic acid sequence set forth as SEQ ID NO: 31. Isolated embryonic stem cells comprising the expression vectors described herein are also provided.

Also provided is a method of identifying an undifferentiated subpopulation of stem cells, wherein the stem cells express Zscan4, comprising detecting expression of one or more of AF067063, Tcstv1/Tcstv3, Tho4, Arginase II, BC061212, Gm428, Eif1a, EG668777 and Pif1. Isolated stem cells identified according to this method are also provided.

The foregoing and other features and advantages will become more apparent from the following detailed description of several embodiments, which proceeds with reference to the accompanying figures.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1A is a series of digital images showing the expression profile of Zscan4 during preimplantation development by whole mount in situ hybridization. Hybridizations were performed simultaneously under the same experimental conditions for all preimplantation developmental stages. Images were taken at 200× magnification using phase contrast. Zscan4 shows a transient and high expression in the late 2-cell embryos. Such a high level of expression was not observed in 3-cell (two examples indicated by red arrows) and 4-cell embryos. FIG. 1B shows a graph of the expression levels of Zscan4 during preimplantation development quantitated by qRT-PCR analysis. Three sets of 10 pooled embryos were collected from each stage (0, oocyte; 1, 1-cell embryo; E2, early 2-cell embryo; L2, late 2-cell embryo; 4, 4-cell embryo; 8, 8-cell embryo; M, morula; and B, blastocyst) and used for qRT-PCR analysis. The expression levels of Zscan4 were normalized to Chuk control, and the average expression levels at each stage are represented as a fold change compared to the expression level in oocytes.

FIG. 2A shows diagrams of the exon-intron structures of nine Zscan4 paralogs. New proposed gene symbols are shown in bold italics with the current gene symbols. FIG. 2B illustrates the putative protein structures of Zscan4 paralogs, and shows predicted domains.

FIG. 3A is a diagram that illustrates the genomic structure of the Zscan4 locus (encompassing 850 kb on Chromosome 7). The top panel shows genes near the Zscan4 locus. The lower panel shows nine Zscan4 paralogous genes and their characteristic features. Six other genes (LOCs) are predicted in this region, but are unrelated to Zscan4. FIG. 3B is a diagram that depicts the TaqI-, MspI-, or TaqI/MspI-digested DNA fragment sizes predicted from the genome sequences assembled from individual BAC sequences. FIG. 3C is a digital image that shows the Southern blot analysis of C57BL/6J genomic DNAs digested with TaqI, MspI, or TaqI/MspI restriction enzymes. Sizes of all DNA fragments hybridized with a Zscan4 probe (containing only exon 3 from cDNA clone C0348C03) matched with those predicted in FIG. 3B, validating the manually assembled sequences.

FIG. 4A is a table showing the three types of siRNA technologies used for the analysis of Zscan4 in preimplantation embryos and their target sequences (SEQ ID NOs: 54-59). FIG. 4B is a diagram that illustrates the locations of siRNA target sequences in the Zscan4 cDNA. FIG. 4C is a series of digital images showing the development of shZscan4-injected embryos. The morphology of representative embryos is shown. Stages of shZscan4-injected and shControl-injected embryos were assessed at 61 hrs, 80 hrs, 98 hrs and 108 hrs post-hCG injections. FIG. 4D is a series of graphs showing the percentage of shZscan4- and shControl-injected embryos at each developmental stage. shZscan4-injected (grey bars) and shControl-injected (white bars) embryos were staged and counted at 61 hrs, 80 hrs, 98 hrs and 108 hrs post-hCG injections (M=morula; B=blastocyst). FIG. 4E is a graph showing the transcript levels of Zscan4 in shControl-injected and shZscan4-injected 2-cell embryos by qRT-PCR analysis. The expression levels were normalized by Eef1a1.

FIGS. 5A-5C are a series of graphs indicating the number of embryos at each developmental stage following injection with shZscan4. Embryos received shZscan4-injection in the nucleus of one blastomere of early 2-cell embryos. The stages of shZscan4- (gray) and shControl- (white) microinjected embryos were assessed at 52 hrs, 74 hrs and 96 hrs post-hCG injections. FIGS. 5D-5F show photographs of a 3-cell embryo (D), an unevenly cleaved embryo (E) and a mixed morula and blastocyst like embryo (F). The 3-cell embryo has one blastomere that remained at the size of a 2-cell stage blastomere and two smaller blastomeres with the size of 4-cell stage blastomeres. The 5-cell embryo has one delayed blastomere and four smaller blastomeres with the size of 8-cell blastomeres. These embryos eventually formed blastocyst-like structures, but seemed to be a mixture of a blastocyst-like cell mass and a morula-like cell mass. The morula-like cell mass was developed from one blastomere receiving shZscan4 injection, as shown by the presence of GFP, which was carried in the shZscan4 plasmid (FIG. 5G). Magnification is 200×.

FIG. 6A is an image that illustrates the expression of Zscan4 and Pou5f1 in blastocysts, blastocyst outgrowth and ES cells by whole mount in situ hybridization. FIG. 6B is a schematic illustration of the Zscan4 expression patterns.

FIGS. 7A-7E are a series of tables comparing nucleotide and amino acid sequence similarity (percent identity) among human ZSCAN4, mouse Zscan4c, Zscan4d, and Zscan4f genes.
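By way of illustration only, percent identity between two already-aligned sequences can be computed as sketched below. The function name `percent_identity`, the example strings, and the convention of skipping gapped columns are assumptions made for this sketch; the disclosure does not state which alignment tool or gap convention was used to produce the tables of FIGS. 7A-7E.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must come from the same pairwise alignment, so they have
    equal length. Skipping gapped columns is one common convention; other
    conventions divide by the full alignment length instead.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # ignore columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, not taken from any Zscan4 sequence.
print(percent_identity("ACGT-A", "ACCTTA"))
```

The same function applies to aligned amino acid sequences, since it only compares characters column by column.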

FIG. 8 is an illustration showing the Zscan4 syntenic regions of mouse and human genomes.

FIGS. 9A-9B are a series of graphs and photographs showing the development of embryos that received a siZscan4-injection in the cytoplasm. FIG. 9A shows the percentage of embryos at each developmental stage for siControl-injected embryos (white bar) and siZscan4-injected embryos (gray bar) at 2.0, 3.5 and 4.0 d.p.c. FIG. 9B shows the percentage of expanded and hatched blastocysts at 4.5 d.p.c. in siControl-injected embryos (gray bar; photograph (a)) and siZscan4-injected embryos (black bar; photograph (b)).

FIGS. 10A-10D are a series of graphs and a table showing the development of embryos that received plus-siZscan4-injection in cytoplasm. FIG. 10A shows the percentage of embryos at each developmental stage for siControl-injected embryos (white bar) and plus-siZscan4-injected embryos (gray bar) at 2.0, 2.2, 3.0, and 4.0 days post coitus. FIGS. 10B and 10C show the transcript levels of Zscan4 in siControl-injected embryos and plus-siZscan4-injected embryos, measured by qRT-PCR analysis and normalized by Chuk (FIG. 10B) and H2afz (FIG. 10C). FIG. 10D provides the raw data of 3 biological replications of qRT-PCR analysis. †, the mean value of the cycle threshold for each biological replicate; ‡, the standard deviation.
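As a non-limiting sketch of how cycle threshold (Ct) values of this kind can be converted to a normalized relative transcript level, the following assumes the widely used 2^(-ΔΔCt) calculation with approximately 100% amplification efficiency. The disclosure does not state that this exact calculation was used; the function name `fold_change` and all Ct values shown are hypothetical.

```python
from statistics import mean


def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^(-ddCt) method.

    Each argument is a list of Ct values from biological replicates.
    The reference gene (for example, a gene such as Chuk or H2afz)
    normalizes for input amount; the control group sets the baseline of 1.0.
    """
    # dCt: target Ct minus reference Ct within each group
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCt: sample dCt minus control dCt; each cycle is a two-fold difference
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values (three biological replicates per group).
knockdown_target = [26.0, 26.0, 26.0]
knockdown_ref = [20.0, 20.0, 20.0]
control_target = [24.0, 24.0, 24.0]
control_ref = [20.0, 20.0, 20.0]
print(fold_change(knockdown_target, knockdown_ref, control_target, control_ref))
```

With these illustrative numbers the target transcript in the knockdown group is one quarter of the control level, because its normalized Ct is two cycles higher.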

FIG. 11 is an illustration depicting the expression vector comprising the Zscan4c promoter sequence and reporter gene Emerald. The sequence of the expression vector is set forth as SEQ ID NO: 28.

FIG. 12A is a fluorescence activated cell sorting (FACS) graph showing a subpopulation of mouse ES cells expressing Zscan4. Mouse ES cells were transfected with an expression vector comprising a Zscan4c promoter and a fluorescent reporter gene (Emerald). Expression of the reporter gene in a cell (an Emerald-positive cell) indicates the cell expresses Zscan4. FIG. 12B is a graph showing expression levels of Zscan4c and Pou5f1 in the subpopulation of ES cells identified as Emerald-positive. The Y-axis represents the fold difference in gene expression between Emerald-positive and Emerald-negative cells.

FIGS. 13A-G are graphs showing expression profiles of Zscan4 and six genes co-expressed with Zscan4 in a sub-population of ES cells. Shown are the expression profiles of Zscan4 (A), AF067063 (B), Tcstv3 (C), Tho4 (D), Arginase II (E), BC061212 (F) and Gm428 (G) in metaphase II oocytes (MII), 1 cell embryos, early 2 cell (e 2 cell) embryos, late 2 cell (l 2 cell) embryos, 4 cell embryos, 8 cell embryos, morula (mo) and blastocysts (bl).

SEQUENCE LISTING

The nucleic and amino acid sequences listed in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases, and three letter code for amino acids, as defined in 37 C.F.R. 1.822. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand. In the accompanying sequence listing:

SEQ ID NOs: 1 and 2 are the nucleotide sequences of forward and reverse PCR primers for amplification of Zscan4d from 2-cell embryos.

SEQ ID NOs: 3 and 4 are the nucleotide sequences of PCR primers for amplifying a probe designed to contain exon 3 of Zscan4.

SEQ ID NO: 5 is the nucleotide sequence of the Zscan4 PCR and sequencing primer Zscan4_For.

SEQ ID NO: 6 is the nucleotide sequence of the Zscan4 PCR and sequencing primer Zscan4_Rev.

SEQ ID NO: 7 is the nucleotide sequence of the Zscan4 sequencing primer Zscan4_400Rev.

SEQ ID NO: 8 is the nucleotide sequence of the Zscan4 sequencing primer Zscan4_300Rev.

SEQ ID NO: 9 is the nucleotide sequence of the shZscan4 siRNA.

SEQ ID NO: 10 is the nucleotide sequence of the siControl siRNA.

SEQ ID NO: 11 is the nucleotide sequence of Genbank Accession No. BC050218 (deposited Apr. 3, 2003), a cDNA clone derived from ES cells (Clone No. C0348C03).

SEQ ID NO: 12 is the nucleotide sequence of Zscan4-ps1.

SEQ ID NO: 13 is the nucleotide sequence of Zscan4-ps2.

SEQ ID NO: 14 is the nucleotide sequence of Zscan4-ps3.

SEQ ID NOs: 15 and 16 are the nucleotide and amino acid sequences of Zscan4a.

SEQ ID NOs: 17 and 18 are the nucleotide and amino acid sequences of Zscan4b.

SEQ ID NOs: 19 and 20 are the nucleotide and amino acid sequences of Zscan4c.

SEQ ID NOs: 21 and 22 are the nucleotide and amino acid sequences of Zscan4d.

SEQ ID NOs: 23 and 24 are the nucleotide and amino acid sequences of Zscan4e.

SEQ ID NOs: 25 and 26 are the nucleotide and amino acid sequences of Zscan4f.

SEQ ID NO: 27 is the nucleotide sequence of Genbank Accession No. XM_145358, deposited Jan. 10, 2006, incorporated by reference herein.

SEQ ID NO: 28 is the nucleotide sequence of the Zscan4-Emerald expression vector.

SEQ ID NOs: 29 and 30 are the nucleotide and amino acid sequences of human ZSCAN4 (Genbank Accession No. NM_152677, deposited Sep. 6, 2002, incorporated by reference herein).

SEQ ID NO: 31 is the nucleotide sequence of the Trim43 promoter.

SEQ ID NOs: 32 and 33 are the nucleotide and amino acid sequences of Trim43.

SEQ ID NOs: 34 and 35 are the nucleotide and amino acid sequences of AF067063, Genbank Accession No. NM_001001449, deposited May 29, 2004, incorporated by reference herein.

SEQ ID NOs: 36 and 37 are the nucleotide and amino acid sequences of BC061212, Genbank Accession No. NM_198667.1, deposited Nov. 15, 2003, incorporated by reference herein.

SEQ ID NOs: 38 and 39 are the nucleotide and amino acid sequences of Gm428, Genbank Accession No. NM_001081644, deposited Feb. 22, 2007, incorporated by reference herein.

SEQ ID NOs: 40 and 41 are the nucleotide and amino acid sequences of Arginase II, Genbank Accession No. NM_009705, deposited Jan. 26, 2000, incorporated by reference herein.

SEQ ID NOs: 42 and 43 are the nucleotide and amino acid sequences of Tcstv1, Genbank Accession No. NM_018756, deposited Jul. 12, 2007, incorporated by reference herein.

SEQ ID NOs: 44 and 45 are the nucleotide and amino acid sequences of Tcstv3, Genbank Accession No. NM_153523, deposited Oct. 13, 2002, incorporated by reference herein.

SEQ ID NOs: 46 and 47 are the nucleotide and amino acid sequences of Tho4, Genbank Accession No. XM_902103, deposited Dec. 2, 2005, incorporated by reference herein.

SEQ ID NOs: 48 and 49 are the nucleotide and amino acid sequences of Eif1a, Genbank Accession No. NM_010120, deposited Aug. 3, 2002, incorporated by reference herein.

SEQ ID NOs: 50 and 51 are the nucleotide and amino acid sequences of EG668777, Genbank Accession No. XM_001003556, deposited Apr. 27, 2006, incorporated by reference herein.

SEQ ID NOs: 52 and 53 are the nucleotide and amino acid sequences of Pif1, Genbank Accession No. NM_172453, deposited Dec. 24, 2002, incorporated by reference herein.

SEQ ID NO: 54 is the nucleotide sequence of the Plus-siZscan4 (J-064700-05) target sequence.

SEQ ID NO: 55 is the nucleotide sequence of the Plus-siZscan4 (J-064700-06) target sequence.

SEQ ID NO: 56 is the nucleotide sequence of the Plus-siZscan4 (J-064700-07) target sequence.

SEQ ID NO: 57 is the nucleotide sequence of the Plus-siZscan4 (J-064700-08) target sequence.

SEQ ID NO: 58 is the nucleotide sequence of the siZscan4 target sequence.

SEQ ID NO: 59 is the nucleotide sequence of the shZscan4 target sequence.

SEQ ID NO: 60 is the nucleotide consensus sequence of nucleotides 1-1848 of Zscan4c, Zscan4d and Zscan4f.

DETAILED DESCRIPTION

I. Abbreviations

-   CDS Coding sequence
-   CMV Cytomegalovirus
-   DNA Deoxyribonucleic acid
-   d.p.c. Days post coitus
-   EC Embryonic carcinoma
-   EG Embryonic germ
-   ES Embryonic stem
-   GS Germline stem
-   GFP Green fluorescent protein
-   hCG Human chorionic gonadotropin
-   ICM Inner cell mass
-   IVF In vitro fertilization
-   LIF Leukemia inhibitory factor
-   maGSC Multipotent adult germline stem cell
-   MAPC Multipotent adult progenitor cell
-   PCR Polymerase chain reaction
-   qRT-PCR Quantitative reverse-transcriptase polymerase chain reaction
-   RNA Ribonucleic acid
-   siRNA Small interfering RNA
-   TS Trophoblast stem
-   USSC Unrestricted somatic stem cell
-   ZGA Zygotic genome activation

II. Terms

Unless otherwise noted, technical terms are used according to conventional usage. Definitions of common terms in molecular biology may be found in Benjamin Lewin, Genes V, published by Oxford University Press, 1994 (ISBN 0-19-854287-9); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0-632-02182-9); and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8).

WO 2008/118957 PCT/US2008/058261

In order to facilitate review of the various embodiments of the invention, the following explanations of specific terms are provided:

Alter: A change in an effective amount of a substance of interest, such as a polynucleotide or polypeptide. The amount of the substance can be changed by a difference in the amount of the substance produced, by a difference in the amount of the substance that has a desired function, or by a difference in the activation of the substance. The change can be an increase or a decrease. The alteration can be in vivo or in vitro. In several embodiments, altering an effective amount of a polypeptide or polynucleotide is at least about a 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% increase or decrease in the effective amount (level) of a substance. Altering an effective amount of a polypeptide or polynucleotide includes increasing the expression of Zscan4 in a cell. In another embodiment, an alteration in a polypeptide or polynucleotide affects a physiological property of a cell, such as the differentiation, proliferation, or viability of the cell. For example, increasing expression of Zscan4 in a stem cell inhibits differentiation and promotes viability of the stem cell.

Blastocyst: The structure formed in early mammalian embryogenesis, after the formation of the blastocele, but before implantation. It possesses an inner cell mass, or embryoblast, and an outer cell mass, or trophoblast. The human blastocyst comprises 70-100 cells. As used herein, blastocyst outgrowth refers to the process of culturing embryonic stem cells derived from the inner cell mass of a blastocyst. Promoting blastocyst outgrowth refers to enhancing the viability and proliferation of embryonic stem cells derived from the blastocyst.

cDNA (complementary DNA): A piece of DNA lacking internal, non-coding segments (introns) and regulatory sequences that determine transcription. cDNA is synthesized in the laboratory by reverse transcription from messenger RNA extracted from cells.

Co-expressed: In the context of the present disclosure, genes that are “co-expressed” with Zscan4 (also referred to as “Zscan4 co-expressed genes”) are genes that exhibit a similar expression pattern as Zscan4 during embryonic development and in ES cells. Specifically, the co-expressed genes are expressed in the same undifferentiated subpopulation of ES cells as Zscan4, and during embryonic development, these genes are most abundantly expressed at the 2-cell stage. Nine co-expressed genes are described herein: AF067063, Tcstv1/Tcstv3, Tho4, Arginase II, BC061212, Gm428, Eif1a, EG668777 and Pif1. However, co-expressed genes are not limited to those disclosed herein, but include any genes exhibiting an expression pattern similar to Zscan4.

AF067063 encodes hypothetical protein LOC380878. The full length cDNA sequence of AF067063 (SEQ ID NO: 34) is 886 base pairs in length and is organized into three exons encoding several hypothetical proteins (for example, SEQ ID NO: 35), which appear to be mouse specific.

BC061212 encodes a protein belonging to the PRAME (preferentially expressed antigen melanoma) family. The full length cDNA sequence of BC061212 (SEQ ID NO: 36) is 1625 base pairs in length and is organized into four exons, encoding a protein of 481 residues in length (SEQ ID NO: 37).

Gm428 (gene model 428) encodes a hypothetical protein. The full length cDNA sequence of Gm428 (SEQ ID NO: 38) is 1325 base pairs in length and is organized into five exons encoding a protein of 360 residues in length (SEQ ID NO: 39).

Arginase II belongs to the Arginase family and may play a role in the regulation of extra-urea cycle arginine metabolism, and in down-regulation of nitric oxide synthesis. The full length cDNA sequence of Arginase II (SEQ ID NO: 40) is 1415 base pairs in length and is organized into eight exons encoding a protein of 354 residues in length (SEQ ID NO: 41).

Tcstv1 and Tcstv3 are splice variants. The full length cDNA of Tcstv1 (SEQ ID NO: 42) is 858 base pairs in length and contains two exons encoding a protein of 171 residues (SEQ ID NO: 43). The full length cDNA sequence of Tcstv3 (SEQ ID NO: 44) is 876 base pairs in length and contains one exon encoding a protein of 169 residues (SEQ ID NO: 45). This family of proteins consists of several hypothetical proteins of approximately 170 residues in length and appears to be mouse-specific.

Tho4 (also called EG627488) encodes a protein with an RNA recognition motif (RRM) involved in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs). The full length cDNA sequence of Tho4 (SEQ ID NO: 46) is 811 base pairs in length and is organized into three exons encoding a protein of 163 residues in length (SEQ ID NO: 47).

Eif1a belongs to the eukaryotic translation initiation factor family. The full length cDNA sequence of Eif1a (SEQ ID NO: 48) is 2881 base pairs in length and encodes a protein of 144 amino acids (SEQ ID NO: 49).

EG668777 is a predicted gene having similarity to retinoblastoma-binding protein 6, isoform 2. The full length cDNA sequence of EG668777 is 1918 base pairs in length (SEQ ID NO: 50) and contains one exon encoding a protein of 547 residues (SEQ ID NO: 51).

Pif1 is an ATP-dependent DNA helicase. The full length cDNA sequence of Pif1 (SEQ ID NO: 52) is 3680 base pairs in length and contains 12 exons encoding a protein of 650 amino acids (SEQ ID NO: 53).

Degenerate variant: A polynucleotide encoding a polypeptide, such as a Zscan4 polypeptide, that includes a sequence that is degenerate as a result of the genetic code. There are 20 natural amino acids, most of which are specified by more than one codon. Therefore, all degenerate nucleotide sequences are included as long as the amino acid sequence of the polypeptide encoded by the nucleotide sequence is unchanged.
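The following minimal sketch illustrates the concept of a degenerate variant using the standard genetic code: two coding sequences that differ in nucleotide sequence are treated as degenerate variants of one another if they translate to the same amino acid sequence. The helper names `translate` and `is_degenerate_variant` and the short example sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence (DNA or RNA letters) codon by codon."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_degenerate_variant(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ in nucleotides but encode the same protein."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


# Hypothetical 3-codon sequences: both encode Met-Leu-Ser.
print(translate("ATGCTGTCA"), translate("ATGTTAAGC"))
print(is_degenerate_variant("ATGCTGTCA", "ATGTTAAGC"))
```

A single nucleotide change that alters the encoded amino acid, or introduces a stop codon, would not satisfy this check.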

Differentiation: Refers to the process by which a cell develops into a specific type of cell (for example, muscle cell, skin cell etc.). In the context of the present disclosure, differentiation of embryonic stem cells refers to the development of the cells toward a specific cell lineage. As a cell becomes more differentiated, the cell loses potency, or the ability to become multiple different cell types. As used herein, inhibiting differentiation means preventing or slowing the development of a cell into a specific lineage.

Embryonic stem (ES) cells: Pluripotent cells isolated from the inner cell mass of the developing blastocyst. “ES cells” can be derived from any organism. ES cells can be derived from mammals. In one embodiment, ES cells are produced from mice, rats, rabbits, guinea pigs, goats, pigs, cows, monkeys and humans. Human and murine derived ES cells are preferred. ES cells are pluripotent cells, meaning that they can generate all of the cells present in the body (bone, muscle, brain cells, etc.). Methods for producing murine ES cells can be found in U.S. Pat. No. 5,670,372, herein incorporated by reference. Methods for producing human ES cells can be found in U.S. Pat. No. 6,090,622, PCT Publication No. WO 00/70021 and PCT Publication No. WO 00/27995, herein incorporated by reference.

Expand: A process by which the number or amount of cells in a cell culture is increased due to cell division. Similarly, the terms “expansion” or “expanded” refer to this process. The terms “proliferate,” “proliferation” or “proliferated” may be used interchangeably with the words “expand,” “expansion”, or “expanded.” Typically, during expansion, the cells do not differentiate to form mature cells.

Expression vector: A vector is a nucleic acid molecule allowing insertion of foreign nucleic acid without disrupting the ability of the vector to replicate and/or integrate in a host cell. A vector can include nucleic acid sequences that permit it to replicate in a host cell, such as an origin of replication. A vector can also include one or more selectable marker genes and other genetic elements. An expression vector is a vector that contains the necessary regulatory sequences to allow transcription and translation of an inserted gene or genes.

Heterologous: A heterologous polypeptide or polynucleotide refers to a polypeptide or polynucleotide derived from a different source or species.

Host cells: Cells in which a vector can be propagated and its DNA expressed. The cell may be prokaryotic or eukaryotic. The term also includes any progeny of the subject host cell. It is understood that all progeny may not be identical to the parental cell since there may be mutations that occur during replication. However, such progeny are included when the term “host cell” is used.

Isolated: An isolated nucleic acid has been substantially separated or purified away from other nucleic acid sequences and from the cell of the organism in which the nucleic acid naturally occurs, i.e., other chromosomal and extrachromosomal DNA and RNA. The term “isolated” thus encompasses nucleic acids purified by standard nucleic acid purification methods. The term also embraces nucleic acids prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acids. Similarly, “isolated” proteins have been substantially separated or purified from other proteins of the cells of an organism in which the protein naturally occurs, and encompasses proteins prepared by recombinant expression in a host cell as well as chemically synthesized proteins.

Multipotent cell: Refers to a cell that can form multiple cell lineages, but not all cell lineages.

Non-human animal: Includes all animals other than humans. A non-human animal includes, but is not limited to, a non-human primate, a farm animal such as swine, cattle, and poultry, a sport animal or pet such as dogs, cats, horses, hamsters, or rodents such as mice, or a zoo animal such as lions, tigers or bears. In one example, the non-human animal is a transgenic animal, such as a transgenic mouse, cow, sheep, or goat. In one specific, non-limiting example, the transgenic non-human animal is a mouse.

Operably linked: A first nucleic acid sequence is operably linked to a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked nucleic acid sequences are contiguous and, where necessary to join two protein coding regions, in the same reading frame.

Pharmaceutically acceptable carriers: The pharmaceutically acceptable carriers of use are conventional. Remington's Pharmaceutical Sciences, by E. W. Martin, Mack Publishing Co., Easton, Pa., 15th Edition (1975), describes compositions and formulations suitable for pharmaceutical delivery of the fusion proteins herein disclosed.

In general, the nature of the carrier will depend on the particular mode of administration being employed. For instance, parenteral formulations usually comprise injectable fluids that include pharmaceutically and physiologically acceptable fluids such as water, physiological saline, balanced salt solutions, aqueous dextrose, glycerol or the like as a vehicle. For solid compositions (e.g., powder, pill, tablet, or capsule forms), conventional non-toxic solid carriers can include, for example, pharmaceutical grades of mannitol, lactose, starch, or magnesium stearate. In addition to biologically-neutral carriers, pharmaceutical compositions to be administered can contain minor amounts of non-toxic auxiliary substances, such as wetting or emulsifying agents, preservatives, and pH buffering agents and the like, for example, sodium acetate or sorbitan monolaurate.

Pharmaceutical agent: A chemical compound, small molecule, or other composition capable of inducing a desired therapeutic or prophylactic effect when properly administered to a subject or a cell. “Incubating” includes a sufficient amount of time for a drug to interact with a cell. “Contacting” includes incubating a drug in solid or in liquid form with a cell.

Pluripotent cell: Refers to a cell that can form all of an organism's cell lineages (endoderm, mesoderm and ectoderm), including germ cells, but cannot form an entire organism autonomously.

Polynucleotide: A nucleic acid sequence (such as a linear sequence) ofany length. Therefore, a polynucleotide includes oligonucleotides, andalso gene sequences found in chromosomes. An “oligonucleotide” is aplurality of joined nucleotides joined by native phosphodiester bonds.An oligonucleotide is a polynucleotide of between 6 and 300 nucleotidesin length. An oligonucleotide analog refers to moieties that functionsimilarly to oligonucleotides but have non-naturally occurring portions.For example, oligonucleotide analogs can contain non-naturally occurringportions, such as altered sugar moieties or inter-sugar linkages, suchas a phosphorothioate oligodeoxynucleotide. Functional analogs ofnaturally occurring polynucleotides can bind to RNA or DNA, and includepeptide nucleic acid (PNA) molecules.

Polypeptide: A polymer in which the monomers are amino acid residues which are joined together through amide bonds. When the amino acids are alpha-amino acids, either the L-optical isomer or the D-optical isomer can be used, the L-isomers being preferred. The terms “polypeptide” or “protein” as used herein are intended to encompass any amino acid sequence and include modified sequences such as glycoproteins. The term “polypeptide” is specifically intended to cover naturally occurring proteins, as well as those which are recombinantly or synthetically produced.

The term “polypeptide fragment” refers to a portion of a polypeptide which exhibits at least one useful epitope. The term “functional fragments of a polypeptide” refers to all fragments of a polypeptide that retain an activity of the polypeptide, such as a Zscan4. Biologically functional fragments, for example, can vary in size from a polypeptide fragment as small as an epitope capable of binding an antibody molecule to a large polypeptide capable of participating in the characteristic induction or programming of phenotypic changes within a cell, including affecting cell proliferation or differentiation. An “epitope” is a region of a polypeptide capable of binding an immunoglobulin generated in response to contact with an antigen. Thus, smaller peptides containing the biological activity of Zscan4, or conservative variants of Zscan4, are included as being of use.

The term “soluble” refers to a form of a polypeptide that is not inserted into a cell membrane.

The term “substantially purified polypeptide” as used herein refers to a polypeptide which is substantially free of other proteins, lipids, carbohydrates or other materials with which it is naturally associated. In one embodiment, the polypeptide is at least 50%, for example at least 80% free of other proteins, lipids, carbohydrates or other materials with which it is naturally associated. In another embodiment, the polypeptide is at least 90% free of other proteins, lipids, carbohydrates or other materials with which it is naturally associated. In yet another embodiment, the polypeptide is at least 95% free of other proteins, lipids, carbohydrates or other materials with which it is naturally associated.

Conservative substitutions replace one amino acid with another amino acid that is similar in size, hydrophobicity, etc. Examples of conservative substitutions are shown below:

Original Residue    Conservative Substitutions
Ala                 Ser
Arg                 Lys
Asn                 Gln; His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
His                 Asn; Gln
Ile                 Leu; Val
Leu                 Ile; Val
Lys                 Arg; Gln; Glu
Met                 Leu; Ile
Phe                 Met; Leu; Tyr
Ser                 Thr
Thr                 Ser
Trp                 Tyr
Tyr                 Trp; Phe
Val                 Ile; Leu
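As an illustration only, the substitution table above can be written as a lookup structure and used to classify the differences between two residue lists. The sketch below is not part of the disclosed methods; the names `CONSERVATIVE`, `is_conservative` and `classify_substitutions` are hypothetical, and the lookup is directional exactly as the table is written (for example, Ala to Ser is listed, Ser to Ala is not).

```python
# Conservative substitution table transcribed from the text
# (three-letter residue codes). All names here are hypothetical.
CONSERVATIVE = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """True if the table lists `replacement` as a substitution for `original`."""
    return replacement in CONSERVATIVE.get(original, set())


def classify_substitutions(native, variant):
    """Count identical, conservative and non-conservative positions
    between two equal-length lists of three-letter residue codes."""
    if len(native) != len(variant):
        raise ValueError("sequences must be the same length")
    counts = {"identical": 0, "conservative": 0, "non_conservative": 0}
    for a, b in zip(native, variant):
        if a == b:
            counts["identical"] += 1
        elif is_conservative(a, b):
            counts["conservative"] += 1
        else:
            counts["non_conservative"] += 1
    return counts


# Toy residue lists, not taken from any disclosed sequence.
result = classify_substitutions(["Ala", "Lys", "Trp"], ["Ser", "Lys", "Ala"])
```

Residues absent from the table (for example Gly and Pro) are simply treated as having no listed conservative substitution.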

Variations in the cDNA sequence that result in amino acid changes, whether conservative or not, should be minimized in order to preserve the functional and immunologic identity of the encoded protein. Thus, in several non-limiting examples, a Zscan4 polypeptide, or other polypeptides disclosed herein, includes at most two, at most five, at most ten, at most twenty, or at most fifty conservative substitutions. The immunologic identity of the protein may be assessed by determining whether it is recognized by an antibody; a variant that is recognized by such an antibody is immunologically conserved. Any cDNA sequence variant will preferably introduce no more than twenty, and preferably fewer than ten amino acid substitutions into the encoded polypeptide. Variant amino acid sequences may be, for example, at least 80%, 90% or even 95% or 98% identical to the native amino acid sequence.
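The substitution limits and identity thresholds in the preceding paragraph amount to simple arithmetic. A minimal sketch follows, assuming the native and variant sequences are already aligned, ungapped and of equal length; the function name `substitution_report`, its default limits and the toy sequences are hypothetical and chosen only to mirror the numbers in the text.

```python
def substitution_report(native, variant, max_substitutions=20, min_identity=80.0):
    """Compare two pre-aligned, equal-length amino acid strings.

    Returns the number of substituted positions, the percent identity,
    and whether both limits (assumed defaults: no more than twenty
    substitutions, at least 80% identity) are satisfied.
    """
    if len(native) != len(variant) or not native:
        raise ValueError("sequences must be non-empty and of equal length")
    substitutions = sum(a != b for a, b in zip(native, variant))
    identity = 100.0 * (len(native) - substitutions) / len(native)
    return {
        "substitutions": substitutions,
        "identity_pct": identity,
        "within_limits": substitutions <= max_substitutions
        and identity >= min_identity,
    }


# Hypothetical 20-residue sequences (one-letter codes), one substitution (Y -> F).
native = "ACDEFGHIKLMNPQRSTVWY"
variant = "ACDEFGHIKLMNPQRSTVWF"
report = substitution_report(native, variant)
```

Sequences that differ in length would first need an alignment step, which this sketch deliberately leaves out.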

Primers: Short nucleic acids, for example DNA oligonucleotides ten nucleotides or more in length, which are annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, then extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid amplification methods known in the art.

Probes and primers as used herein may, for example, include at least 10 nucleotides of the nucleic acid sequences that are shown to encode specific proteins. In order to enhance specificity, longer probes and primers may also be employed, such as probes and primers that comprise 15, 20, 30, 40, 50, 60, 70, 80, 90 or 100 consecutive nucleotides of the disclosed nucleic acid sequences. Methods for preparing and using probes and primers are described in the references, for example Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y.; Ausubel et al. (1987) Current Protocols in Molecular Biology, Greene Publ. Assoc. & Wiley-Intersciences; Innis et al. (1990) PCR Protocols, A Guide to Methods and Applications, Innis et al. (Eds.), Academic Press, San Diego, Calif. PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge, Mass.).

When referring to a probe or primer, the term “specific for (a target sequence)” indicates that the probe or primer hybridizes under stringent conditions substantially only to the target sequence in a given sample comprising the target sequence.

Prolonging viability: As used herein, “prolonging viability” of a stem cell refers to extending the duration of time a stem cell is capable of normal growth and/or survival.

Promoter: A promoter is an array of nucleic acid control sequences which direct transcription of a nucleic acid. A promoter includes necessary nucleic acid sequences near the start site of transcription. A promoter also optionally includes distal enhancer or repressor elements. A “constitutive promoter” is a promoter that is continuously active and is not subject to regulation by external signals or molecules. In contrast, the activity of an “inducible promoter” is regulated by an external signal or molecule (for example, a transcription factor).

Reporter gene: A reporter gene is a gene operably linked to another gene or nucleic acid sequence of interest (such as a promoter sequence). Reporter genes are used to determine whether the gene or nucleic acid of interest is expressed in a cell or has been activated in a cell. Reporter genes typically have easily identifiable characteristics, such as fluorescence, or easily assayed products, such as an enzyme. Reporter genes can also confer antibiotic resistance to a host cell. In one embodiment, the reporter gene encodes the fluorescent protein Emerald. In another embodiment, the reporter gene encodes the fluorescent protein Strawberry.

Senescence: The inability of a cell to divide further. A senescent cell is still viable, but does not divide.

Stem cell: A cell having the unique capacity to produce unaltered daughter cells (self-renewal; cell division produces at least one daughter cell that is identical to the parent cell) and to give rise to specialized cell types (potency). Stem cells include, but are not limited to, ES cells, EG cells, GS cells, MAPCs, maGSCs and USSCs. In one embodiment, stem cells can generate a fully differentiated functional cell of more than one given cell type. The role of stem cells in vivo is to replace cells that are destroyed during the normal life of an animal. Generally, stem cells can divide without limit. After division, the stem cell may remain as a stem cell, become a precursor cell, or proceed to terminal differentiation. A precursor cell is a cell that can generate a fully differentiated functional cell of at least one given cell type. Generally, precursor cells can divide. After division, a precursor cell can remain a precursor cell, or may proceed to terminal differentiation.

Subpopulation: An identifiable portion of a population. As used herein, a “subpopulation” of stem cells expressing Zscan4 is the portion of stem cells in a given population that has been identified as expressing Zscan4. In one embodiment, the subpopulation is identified using an expression vector comprising a Zscan4 promoter and a reporter gene, wherein detection of expression of the reporter gene in a cell indicates the cell expresses Zscan4 and is part of the subpopulation. As described herein, the subpopulation of ES cells expressing Zscan4 can further be identified by co-expression of one or more genes disclosed herein, including AF067063, Tcstv1/Tcstv3, Tho4, Arginase II, BC061212, Gm428, Eif1a, EG668777 and Pif1.

Totipotent cell: Refers to a cell that can form an entire organism autonomously. Only a fertilized egg (oocyte) possesses this ability (stem cells do not).

Transgenic animal: A non-human animal, usually a mammal, having a non-endogenous (heterologous) nucleic acid sequence present as an extrachromosomal element in a portion of its cells or stably integrated into its germ line DNA (i.e., in the genomic sequence of most or all of its cells). Heterologous nucleic acid is introduced into the germ line of such transgenic animals by genetic manipulation of, for example, embryos or embryonic stem cells of the host animal according to methods well known in the art. A “transgene” is meant to refer to such heterologous nucleic acid, such as, heterologous nucleic acid in the form of an expression construct (such as for the production of a “knock-in” transgenic animal) or a heterologous nucleic acid that upon insertion within or adjacent to a target gene results in a decrease in target gene expression (such as for production of a “knock-out” transgenic animal).

Transfecting or transfection: Refers to the process of introducing nucleic acid into a cell or tissue. Transfection can be achieved by any one of a number of methods, such as, but not limited to, liposomal-mediated transfection, electroporation and injection.

Trim43 (tripartite motif-containing protein 43): A gene identified herein as exhibiting morula-specific expression during embryonic development. The nucleotide and amino acid sequences of Trim43 are provided herein as SEQ ID NO: 32 and SEQ ID NO: 33, respectively.

Zscan4: A group of genes identified herein as exhibiting 2-cell embryonic stage and ES cell-specific expression. In the mouse, the term “Zscan4” refers to a collection of genes including three pseudogenes (Zscan4-ps1, Zscan4-ps2 and Zscan4-ps3) and six expressed genes (Zscan4a, Zscan4b, Zscan4c, Zscan4d, Zscan4e and Zscan4f). As used herein, Zscan4 also includes human ZSCAN4. Zscan4 refers to Zscan4 polypeptides and Zscan4 polynucleotides encoding the Zscan4 polypeptides.

Unless otherwise explained, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The singular terms “a,” “an,” and “the” include plural referents unless context clearly indicates otherwise. Similarly, the word “or” is intended to include “and” unless the context clearly indicates otherwise. Hence “comprising A or B” means including A, or B, or A and B. It is further to be understood that all base sizes or amino acid sizes, and all molecular weight or molecular mass values, given for nucleic acids or polypeptides are approximate, and are provided for description. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including explanations of terms, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

III. Overview of Several Embodiments

Disclosed herein are Zscan4 polypeptides and polynucleotides encoding these polypeptides, which are of use in inhibiting differentiation and increasing proliferation of cells, such as stem cells, including embryonic stem cells. Stem cells, especially ES cells in the undifferentiated condition, were previously considered to be a relatively homogenous cell population. However, described herein is the unique expression of Zscan4 in a subpopulation of stem cells, which establishes the presence of a unique cell population among undifferentiated ES cells and provides the means to identify and isolate these cells. Also described herein is the identification of nine genes co-expressed with Zscan4 in the undifferentiated ES cell subpopulation. These genes include AF067063, Tcstv1/Tcstv3, Tho4, Arginase II, BC061212, Gm428, Eif1a, EG668777 and Pif1. Further described herein is the identification of Trim43 as a gene exhibiting morula-specific expression.

It is disclosed herein that Zscan4 is specifically expressed during the 2-cell embryonic stage and in a subpopulation of embryonic stem cells. There is a genus of Zscan4-related genes, including three pseudogenes (Zscan4-ps1, Zscan4-ps2 and Zscan4-ps3) and six expressed genes (Zscan4a, Zscan4b, Zscan4c, Zscan4d, Zscan4e and Zscan4f). The Zscan4 genus also includes human ZSCAN4. It is further disclosed herein that AF067063, Tcstv1/Tcstv3, Tho4, Arginase II, BC061212, Gm428, Eif1a, EG668777 and Pif1 are co-expressed with Zscan4 during embryonic development. Like Zscan4, during embryonic development, these genes are expressed most abundantly at the 2-cell stage.

Methods are provided herein for inhibiting differentiation of a stem cell comprising increasing the expression of Zscan4 in the stem cell. As described herein, the use of Zscan4 includes the use of any Zscan4 gene, including Zscan4a, Zscan4b, Zscan4c, Zscan4d, Zscan4e, Zscan4f and human ZSCAN4. In some embodiments, the Zscan4 gene is at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to Zscan4c (SEQ ID NO: 19), Zscan4d (SEQ ID NO: 21) or Zscan4f (SEQ ID NO: 25). In another embodiment, the Zscan4 gene comprises SEQ ID NO: 60.

Increasing expression of Zscan4 in a cell, such as a stem cell, can be achieved according to any number of methods well known in the art. In one embodiment, increasing expression of Zscan4 in a stem cell comprises transfecting the stem cell with a nucleotide encoding Zscan4 operably linked to a promoter. The promoter can be any type of promoter, including a constitutive promoter or an inducible promoter. In one embodiment, the stem cells are transfected with a vector comprising the nucleotide sequence encoding Zscan4 operably linked to the promoter. The vector can be any type of vector, such as a viral vector or a plasmid vector. In one embodiment, the Zscan4 nucleotide sequence is at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to Zscan4c (SEQ ID NO: 19), Zscan4d (SEQ ID NO: 21) or Zscan4f (SEQ ID NO: 25). In another embodiment, the Zscan4 nucleotide sequence comprises SEQ ID NO: 60.

In one embodiment of the methods described herein, inhibiting differentiation of the stem cell increases viability of the stem cell. In another embodiment, inhibiting differentiation of the stem cell prevents senescence of the stem cell. As described herein, the stem cell can be any type of stem cell, including, but not limited to, an embryonic stem cell, an embryonic germ cell, a germline stem cell or a multipotent adult progenitor cell.

Also provided herein is a method of promoting blastocyst outgrowth of an embryonic stem cell, comprising increasing the expression of Zscan4 in the embryonic stem cell, thereby promoting blastocyst outgrowth of the embryonic stem cell. Promoting blastocyst outgrowth can include increasing the efficiency of outgrowth or increasing the number of embryonic stem cells resulting from blastocyst outgrowth. In one embodiment, the method comprises increasing expression of Zscan4 in the cells during the early stages of blastocyst outgrowth, such as prior to proliferation of the stem cells. As described herein, Zscan4 includes any Zscan4 gene, including Zscan4a, Zscan4b, Zscan4c, Zscan4d, Zscan4e, Zscan4f, and human ZSCAN4. In one embodiment, the Zscan4 gene is at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to Zscan4c (SEQ ID NO: 19), Zscan4d (SEQ ID NO: 21) or Zscan4f (SEQ ID NO: 25). In another embodiment, the Zscan4 gene comprises SEQ ID NO: 60.

In one embodiment, increasing the expression of Zscan4 in the stem cell comprises transfecting the stem cell with a nucleotide sequence encoding a Zscan4 operably linked to a promoter. The promoter can be any type of promoter, including an inducible promoter or a constitutive promoter. In one embodiment, the cells are transfected with a vector comprising the nucleotide encoding Zscan4 operably linked to a promoter. The vector can be any type of vector, including a viral vector or a plasmid vector.

A method is also provided for identifying a subpopulation of stem cells expressing Zscan4, comprising transfecting the cells with an expression vector comprising a Zscan4 promoter and a reporter gene, wherein expression of the reporter gene indicates Zscan4 is expressed in the subpopulation of stem cells. In one embodiment, the promoter is a Zscan4c promoter. In another embodiment, the Zscan4c promoter includes the nucleic acid sequence set forth as nucleotides 1-2540 of SEQ ID NO: 28, such as nucleotides 1-2643, 1-3250, or 1-3347 of SEQ ID NO: 28. In another embodiment, the expression vector comprises the nucleic acid sequence set forth as SEQ ID NO: 28. As described herein, the subpopulation of ES cells expressing Zscan4 is in an undifferentiated state. Further provided is a method of identifying the undifferentiated subpopulation of ES cells by detecting expression of one or more Zscan4 co-expressed genes, such as AF067063, Tcstv1/Tcstv3, Tho4, Arginase II, BC061212, Gm428, Eif1a, EG668777 and Pif1. Detection of expression of these genes can be accomplished using any means well known in the art, such as, for example, RT-PCR, Northern blot or in situ hybridization. Further provided are isolated stem cells identified according to this method.
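The promoter fragments above are given as nucleotide ranges that are 1-based and inclusive (for example, nucleotides 1-2540). The following sketch, offered only as an illustration, shows how such a range maps onto zero-based string slicing. SEQ ID NO: 28 is not reproduced in this section, so a placeholder string stands in for it; the function name `subsequence` and the placeholder are hypothetical.

```python
def subsequence(seq, start, end):
    """Return nucleotides start..end of `seq`, where both coordinates are
    1-based and inclusive, as ranges are written in the text."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range must satisfy 1 <= start <= end <= len(seq)")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return seq[start - 1:end]


# Placeholder only: NOT the actual sequence of SEQ ID NO: 28.
placeholder = "ACGT" * 1000

# End coordinates of the promoter fragments named in the text.
fragment_ends = [2540, 2643, 3250, 3347]
fragments = {end: subsequence(placeholder, 1, end) for end in fragment_ends}
```

Each fragment begins at nucleotide 1, so every shorter fragment is a prefix of the longer ones.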

An isolated expression vector comprising a Zscan4 promoter operably linked to a nucleic acid sequence encoding a heterologous polypeptide is also provided. In one embodiment, the Zscan4 promoter is a Zscan4c promoter. In another embodiment, the Zscan4c promoter comprises the nucleic acid sequence set forth as nucleotides 1-2540 of SEQ ID NO: 28, such as nucleotides 1-2643, 1-3250, or 1-3347 of SEQ ID NO: 28. In another embodiment, the heterologous polypeptide is a marker, enzyme or fluorescent protein. The expression vector can be any type of vector, including, but not limited to a viral vector or a plasmid vector.

Further provided herein is an ES cell line comprising an expression vector comprising a Zscan4 promoter operably linked to a heterologous polypeptide. In one embodiment, the Zscan4 promoter is a Zscan4c promoter. In another embodiment, the Zscan4c promoter comprises the nucleic acid sequence set forth as nucleotides 1-2540 of SEQ ID NO: 28, such as nucleotides 1-2643, 1-3250, or 1-3347 of SEQ ID NO: 28. In another embodiment, the heterologous polypeptide is a marker, enzyme or fluorescent protein. In one example, the fluorescent protein is Emerald.

An isolated expression vector comprising a Trim43 promoter operably linked to a nucleic acid sequence encoding a heterologous polypeptide is also provided. In one embodiment, the Trim43 promoter comprises at least a portion of the nucleic acid sequence set forth as SEQ ID NO: 31. The portion of SEQ ID NO: 31 to be included in the expression vector is at least a portion of SEQ ID NO: 31 that is capable of promoting transcription of the heterologous polypeptide in a cell in which Trim43 is expressed. In some embodiments, the Trim43 promoter sequence is at least 70%, at least 80%, at least 90%, at least 95% or at least 99% identical to SEQ ID NO: 31. In another embodiment, the Trim43 promoter comprises SEQ ID NO: 31. In another embodiment, the Trim43 promoter consists of SEQ ID NO: 31. In some embodiments, the heterologous polypeptide is a marker, enzyme or fluorescent protein. In one example, the fluorescent protein is Strawberry. The expression vector can be any type of vector, including, but not limited to a viral vector or a plasmid vector.

Further provided herein is an ES cell line containing an expression vector comprising a Trim43 promoter operably linked to a heterologous polypeptide. In one embodiment, the Trim43 promoter comprises at least a portion of the nucleic acid sequence set forth as SEQ ID NO: 31. In some embodiments, the Trim43 promoter sequence is at least 70%, at least 80%, at least 90%, at least 95% or at least 99% identical to SEQ ID NO: 31. In another embodiment, the Trim43 promoter comprises SEQ ID NO: 31. In another embodiment, the Trim43 promoter consists of SEQ ID NO: 31. In another embodiment, the heterologous polypeptide is a marker, enzyme or fluorescent protein. In one example, the fluorescent protein is Strawberry.

Provided herein are antibodies specific for Zscan4. In one embodiment, the Zscan4 antibodies specifically recognize Zscan4a, Zscan4b, Zscan4c, Zscan4d, Zscan4e, Zscan4f or human ZSCAN4. Also provided are antibodies specific for each Zscan4 co-expressed gene, including antibodies raised against at least a portion of a polypeptide encoded by AF067063, Tcstv1/Tcstv3, Tho4, Arginase II, BC061212, Gm428, Eif1a, EG668777 or Pif1.

Also described herein are transgenic animals harboring a transgene that includes the Zscan4 polynucleotide sequences disclosed herein. Also provided are transgenic animals harboring a transgene that includes polynucleotide sequences of one or more of the Zscan4 co-expressed genes. Such transgenic animals include, but are not limited to, transgenic mice.

Further provided is a transgenic non-human animal comprising a nucleic acid sequence (a transgene) encoding a heterologous polypeptide operably linked to a Zscan4 promoter. In some embodiments, the heterologous polypeptide is a marker, enzyme or fluorescent protein. In one embodiment, the heterologous polypeptide is a fluorescent protein. In one example, the fluorescent protein is Emerald. In one embodiment, the Zscan4 promoter is a Zscan4c promoter. In another embodiment, the Zscan4c promoter comprises the nucleic acid sequence set forth as nucleotides 1-2540 of SEQ ID NO: 28, such as nucleotides 1-2643, 1-3250, or 1-3347 of SEQ ID NO: 28.

In another embodiment, the transgenic non-human animal further comprises a nucleic acid sequence encoding a heterologous polypeptide operably linked to a Trim43 promoter. In one embodiment, the Trim43 promoter comprises the nucleic acid sequence set forth as SEQ ID NO: 31. The heterologous polypeptide can be, for example, a marker, enzyme or fluorescent protein. In one embodiment, the heterologous polypeptide is a fluorescent protein. In one example, the fluorescent protein is Strawberry. In some embodiments, the transgenic non-human animal is a transgenic mouse.

Also provided herein are isolated embryonic stem cells obtained from an embryo of the transgenic non-human animal. In one embodiment, the transgenic non-human animal is a transgenic mouse.

IV. Methods of Inducing Differentiation and/or Inhibiting Proliferation of Stem Cells

A method for inhibiting differentiation of a stem cell is disclosed herein. A method for increasing viability and/or inducing proliferation of a stem cell is also disclosed herein. A method is also provided herein for inhibiting senescence of a stem cell. The methods include altering the level of a Zscan4 polypeptide in the cell, thereby inhibiting differentiation and/or inducing proliferation of the cell, and/or inhibiting senescence of the cell. The cell can be in vivo or in vitro.

It is shown herein that inhibiting Zscan4 in embryos blocks the 2- to 4-cell stage embryonic transition. Inhibition of Zscan4 expression also prevents blastocysts from expanding and implanting and prevents the outgrowth of embryonic stem cells from blastocysts. In addition, in embryonic stem cells, Zscan4 expression is only detected in a subpopulation of undifferentiated stem cells. Thus, expression of Zscan4 plays an important role in maintaining ES cells in an undifferentiated state, which is necessary for ES cell viability and proliferation. Zscan4 is also important in allowing outgrowth of ES cells from blastocysts. Therefore, provided herein are methods of increasing expression of Zscan4 in a stem cell to inhibit differentiation, increase viability and prevent senescence of a stem cell. The methods provided herein also include increasing expression of Zscan4 to promote blastocyst outgrowth of ES cells.

Expression of Zscan4 can be increased to inhibit differentiation and/or induce proliferation. In one example, expression of Zscan4 is increased as compared to a control. Increased expression includes, but is not limited to, at least a 20% increase in the amount of Zscan4 mRNA or polypeptide in a cell as compared to a control, such as, but not limited to, at least about a 30%, 50%, 75%, 100%, or 200% increase of Zscan4 mRNA or polypeptide. Suitable controls include a cell not contacted with an agent that alters Zscan4 expression, or not transfected with a vector encoding Zscan4, such as a wild-type stem cell. Suitable controls also include standard values. Exemplary Zscan4 amino acid sequences are set forth in the Sequence Listing as SEQ ID NO: 16 (Zscan4a), SEQ ID NO: 18 (Zscan4b), SEQ ID NO: 20 (Zscan4c), SEQ ID NO: 22 (Zscan4d), SEQ ID NO: 24 (Zscan4e), SEQ ID NO: 26 (Zscan4f) and SEQ ID NO: 30 (human ZSCAN4).
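The percent-increase criterion above is a ratio against the control value. As a minimal sketch, assuming expression has already been quantified in the same arbitrary units for the sample and the control, the comparison could be computed as follows. The function names and the measurement values are hypothetical and are not data from this disclosure.

```python
def percent_increase(sample, control):
    """Percent increase of `sample` over `control` (same arbitrary units)."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (sample - control) / control


def thresholds_met(sample, control, thresholds=(20, 30, 50, 75, 100, 200)):
    """Return the thresholds from the text (in percent) that the sample meets."""
    increase = percent_increase(sample, control)
    return [t for t in thresholds if increase >= t]


# Hypothetical measurements: control = 100 units, transfected cell = 180 units.
met = thresholds_met(180, 100)
```

Here an 80% increase satisfies the 20%, 30%, 50% and 75% thresholds but not the 100% or 200% thresholds. Measured values should be compared with an appropriate tolerance, since floating-point ratios of non-integer measurements may fall marginally below an exact threshold.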

Specific, non-limiting examples of Zscan4 polypeptides include polypeptides including an amino acid sequence at least about 80%, 85%, 90%, 95%, or 99% homologous to the amino acid sequence set forth in SEQ ID NO: 16, 18, 20, 22, 24, 26 or 30. In a further embodiment, a Zscan4 polypeptide is a conservative variant of SEQ ID NO: 16, 18, 20, 22, 24, 26 or 30, such that it includes no more than fifty conservative amino acid substitutions, such as no more than two, no more than five, no more than ten, no more than twenty, or no more than fifty conservative amino acid substitutions in SEQ ID NO: 16, 18, 20, 22, 24, 26 or 30. In another embodiment, a Zscan4 polypeptide has an amino acid sequence as set forth in SEQ ID NO: 16, 18, 20, 22, 24, 26 or 30.

Fragments and variants of a Zscan4 polypeptide can readily be prepared by one of skill in the art using molecular techniques. In one embodiment, a fragment of a Zscan4 polypeptide includes at least 8, 10, 15, or 20 consecutive amino acids of the Zscan4 polypeptide. In another embodiment, a fragment of a Zscan4 polypeptide includes a specific antigenic epitope found on a full-length Zscan4. In a further embodiment, a fragment of Zscan4 is a fragment that confers a function of Zscan4 when transferred into a cell of interest, such as, but not limited to, inhibiting differentiation or increasing proliferation of the cell.

One skilled in the art, given the disclosure herein, can purify a Zscan4 polypeptide using standard techniques for protein purification. The substantially pure polypeptide will yield a single major band on a non-reducing polyacrylamide gel. The purity of the Zscan4 polypeptide can also be determined by amino-terminal amino acid sequence analysis.

Minor modifications of the Zscan4 polypeptide primary amino acid sequences may result in peptides which have substantially equivalent activity as compared to the unmodified counterpart polypeptide described herein. Such modifications may be deliberate, as by site-directed mutagenesis, or may be spontaneous. All of the polypeptides produced by these modifications are included herein.

One of skill in the art can readily produce fusion proteins including a Zscan4 polypeptide and a second polypeptide of interest. Optionally, a linker can be included between the Zscan4 polypeptide and the second polypeptide of interest. Fusion proteins include, but are not limited to, a polypeptide including a Zscan4 polypeptide and a marker protein. In one embodiment, the marker protein can be used to identify or purify a Zscan4 polypeptide. Exemplary fusion proteins include, but are not limited to, fusions of a Zscan4 polypeptide with green fluorescent protein, six histidine residues, or myc.

Polynucleotides encoding a Zscan4 polypeptide are also provided, and are termed Zscan4 polynucleotides. These polynucleotides include DNA, cDNA and RNA sequences which encode a Zscan4. It is understood that all polynucleotides encoding a Zscan4 polypeptide are also included herein, as long as they encode a polypeptide with the recognized activity, such as the binding to an antibody that recognizes a Zscan4 polypeptide, or modulating cellular differentiation or proliferation. The polynucleotides include sequences that are degenerate as a result of the genetic code. There are 20 natural amino acids, most of which are specified by more than one codon. Therefore, all degenerate nucleotide sequences are included as long as the amino acid sequence of the Zscan4 polypeptide encoded by the nucleotide sequence is functionally unchanged. A Zscan4 polynucleotide encodes a Zscan4 polypeptide, as disclosed herein. Exemplary polynucleotide sequences encoding Zscan4 are set forth in the Sequence Listing as SEQ ID NO: 12 (Zscan4-ps1), SEQ ID NO: 13 (Zscan4-ps2), SEQ ID NO: 14 (Zscan4-ps3), SEQ ID NO: 15 (Zscan4a), SEQ ID NO: 17 (Zscan4b), SEQ ID NO: 19 (Zscan4c), SEQ ID NO: 21 (Zscan4d), SEQ ID NO: 23 (Zscan4e), SEQ ID NO: 25 (Zscan4f) and SEQ ID NO: 29 (human ZSCAN4).
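Degeneracy of the genetic code, as referred to above, can be illustrated with the standard codon table: two different nucleotide sequences translate to the same amino acid sequence when they differ only at synonymous codons. The sketch below builds the standard table in Python; the function names and the two short DNA strings are hypothetical examples and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3) to one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_codons(amino_acid):
    """All codons that specify the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Two hypothetical sequences that differ at the second and third codons
# yet encode the same peptide (Met-Leu-Ser).
seq_a = "ATGCTGAGC"
seq_b = "ATGTTATCA"
same_peptide = translate(seq_a) == translate(seq_b)

# Amino acids specified by more than one codon in the standard code.
multi_codon = {aa for aa in set(AMINO_ACIDS) - {"*"} if len(synonymous_codons(aa)) > 1}
```

In the standard code only methionine and tryptophan are specified by a single codon, which is consistent with the statement that most of the 20 amino acids have more than one.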

In some embodiments, the Zscan4 polynucleotide sequence is at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to Zscan4c (SEQ ID NO: 19), Zscan4d (SEQ ID NO: 21) or Zscan4f (SEQ ID NO: 25). In another embodiment, the Zscan4 gene comprises SEQ ID NO: 60.

The Zscan4 polynucleotides include recombinant DNA which is incorporated into a vector; into an autonomously replicating plasmid or virus; or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA) independent of other sequences. The nucleotides can be ribonucleotides, deoxyribonucleotides, or modified forms of either nucleotide. The term includes single- and double-stranded forms of DNA. Also included in this disclosure are fragments of the above-described nucleic acid sequences that are at least 15 bases in length, which is sufficient to permit the fragment to selectively hybridize to DNA that encodes the disclosed Zscan4 polypeptide (e.g., a polynucleotide that encodes SEQ ID NO: 16, 18, 20, 22, 24, 26 or 30) under physiological conditions. The term “selectively hybridize” refers to hybridization under moderately or highly stringent conditions, which excludes non-related nucleotide sequences.

Also contemplated herein is the use of a Zscan4 polynucleotide, or the complement of a Zscan4 polynucleotide, for RNA interference. Fragments of Zscan4 polynucleotides or their complements can be designed as siRNA molecules to inhibit expression of one or more Zscan4 proteins. In one embodiment, the siRNA compounds are fragments of a Zscan4 pseudogene. Methods of preparing and using siRNA are generally disclosed in U.S. Pat. No. 6,506,559, incorporated herein by reference (see also reviews by Milhavet et al., Pharmacological Reviews 55:629-648, 2003; and Gitlin et al., J. Virol. 77:7159-7165, 2003; incorporated herein by reference). The double-stranded structure of siRNA can be formed by a single self-complementary RNA strand or two complementary RNA strands.
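The relationship between a fragment of a target polynucleotide and its complement can be illustrated computationally. The sketch below enumerates fixed-length windows of a target RNA and pairs each sense fragment with its reverse complement, which together would form a duplex. It is an illustration only: the window length of 21 nucleotides is an assumption not stated in this disclosure, the target string is a made-up placeholder rather than a Zscan4 sequence, and no design rules for effective siRNA are implied.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def candidate_duplexes(target, length=21):
    """Yield (start, sense, antisense) for every window of `length` nucleotides.

    `start` is 1-based. DNA input is accepted and converted to RNA (T -> U).
    The default length of 21 is an assumption made for this sketch.
    """
    rna = target.upper().replace("T", "U")
    for i in range(len(rna) - length + 1):
        sense = rna[i:i + length]
        yield i + 1, sense, reverse_complement_rna(sense)


# Placeholder target, not a disclosed sequence.
windows = list(candidate_duplexes("AUGCAUGCAU", length=8))
```

For a target of n nucleotides and a window of k nucleotides this produces n - k + 1 candidate pairs.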

The siRNA can comprise one or more strands of polymerized ribonucleotide, and may include modifications to either the phosphate-sugar backbone or the nucleoside. For example, the phosphodiester linkages of natural RNA can be modified to include at least one of a nitrogen or sulfur heteroatom. Modifications in RNA structure can be tailored to allow specific genetic inhibition while avoiding a general panic response in some organisms which is generated by dsRNA. Likewise, bases can be modified to block the activity of adenosine deaminase.

Inhibition is sequence-specific in that nucleotide sequences corresponding to the duplex region of the RNA are targeted for genetic inhibition. Nucleic acid containing a nucleotide sequence identical to a portion of a target sequence can be used for inhibition. RNA sequences with insertions, deletions, and single point mutations relative to the target sequence have also been found to be effective for inhibition. Sequence identity may be optimized by alignment algorithms known in the art and calculating the percent difference between the nucleotide sequences. Alternatively, the duplex region of the RNA can be defined functionally as a nucleotide sequence that is capable of hybridizing with a portion of the target gene transcript.

Sequence identity can be optimized by sequence comparison and alignment algorithms known in the art (see Gribskov and Devereux, Sequence Analysis Primer, Stockton Press, 1991, and references cited therein) and calculating the percent difference between the nucleotide sequences by, for example, the Smith-Waterman algorithm as implemented in the BESTFIT software program using default parameters (e.g., University of Wisconsin Genetic Computing Group). Greater than 90% sequence identity, or even 100% sequence identity, between the inhibitory RNA and the portion of the particular target gene sequence is preferred. Alternatively, the duplex region of the RNA can be defined functionally as a nucleotide sequence that is capable of hybridizing with a portion of the particular target gene (e.g., 400 mM NaCl, 40 mM PIPES pH 6.4, 1 mM EDTA, 50° C. or 70° C. hybridization for 12-16 hours; followed by washing). The length of the identical nucleotide sequences may be at least 20, 25, 50, 100, 200, 300 or 400 bases. A 100% sequence identity between the RNA and Zscan4 is not required to practice the present methods.
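The percent identity calculation can be illustrated with a short Python sketch of Smith-Waterman local alignment. This is not the BESTFIT program: the scoring values (match +2, mismatch -1, linear gap -2) and all function names are assumptions chosen for illustration, and BESTFIT's default parameters would be expected to differ.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment.

    Uses a linear gap penalty; all scoring values are illustrative.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding identical, non-gap residues."""
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def meets_identity(inhibitory_rna, target, threshold=90.0):
    """True if the local alignment exceeds the given percent identity."""
    _, al_a, al_b = smith_waterman(inhibitory_rna, target)
    return percent_identity(al_a, al_b) > threshold
```

Note that identity is computed over the aligned region only, so a short perfect local match can report 100% identity; in practice the aligned length would also be checked against the minimum lengths stated above.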

For siRNA (RNAi), the RNA can be introduced directly into the cell (that is, intracellularly), or introduced extracellularly into a cavity, an interstitial space, or the circulation of an organism, introduced orally, or introduced by bathing an organism in a solution containing the RNA. Physical methods of introducing nucleic acids include injection of a solution containing the RNA, bombardment by particles covered by the RNA, soaking the cell or organism in a solution of the RNA, or electroporation of cell membranes in the presence of the RNA. A viral construct packaged into a viral particle can both efficiently introduce an expression construct into the cell and provide transcription of RNA encoded by the expression construct. Other methods known in the art for introducing nucleic acids to cells may be used, such as lipid-mediated carrier transport, chemical-mediated transport, such as calcium phosphate, and the like. Thus, the RNA may be introduced along with components that perform one or more of the following activities: enhance RNA uptake by the cell, promote annealing of the duplex strands, stabilize the annealed strands, or otherwise increase inhibition of the target gene.

RNA may be synthesized either in vivo or in vitro. Endogenous RNA polymerase of the cell can mediate transcription in vivo, or cloned RNA polymerase can be used for transcription in vivo or in vitro. For transcription from a transgene in vivo or an expression construct, a regulatory region can be used to transcribe the RNA strand (or strands). RNA may be chemically or enzymatically synthesized by manual or automated reactions. The RNA may be synthesized by a cellular RNA polymerase or a bacteriophage RNA polymerase (for example, T3, T7, SP6). The use and production of expression constructs are known in the art (for example, PCT Publication No. WO 97/32016; U.S. Pat. Nos. 5,593,874, 5,698,425, 5,712,135, 5,789,214, and 5,804,693; and the references cited therein). If synthesized chemically or by in vitro enzymatic synthesis, the RNA can be purified prior to introduction into the cell. For example, RNA can be purified from a mixture by extraction with a solvent or resin, precipitation, electrophoresis, chromatography, or a combination thereof. Alternatively, the RNA can be used with no or a minimum of purification to avoid losses due to sample processing. The RNA can be dried for storage or dissolved in an aqueous solution. The solution can contain buffers or salts to promote annealing, and/or stabilization of the duplex strands.

A polynucleotide encoding Zscan4 can be included in an expression vector to direct expression of the Zscan4 nucleic acid sequence. Thus, other expression control sequences including appropriate promoters, enhancers, transcription terminators, a start codon (i.e., ATG) in front of a protein-encoding gene, splicing signal for introns, maintenance of the correct reading frame of that gene to permit proper translation of mRNA, and stop codons can be included in an expression vector. Generally, expression control sequences include a promoter, a minimal sequence sufficient to direct transcription.

The expression vector typically contains an origin of replication, a promoter, as well as specific genes which allow phenotypic selection of the transformed cells (e.g., an antibiotic resistance cassette). Vectors suitable for use include, but are not limited to, the pMSXND expression vector for expression in mammalian cells (Lee and Nathans, J. Biol. Chem. 263:3521, 1988). Generally, the expression vector will include a promoter. The promoter can be inducible or constitutive. The promoter can be tissue specific. Suitable promoters include the thymidine kinase promoter (TK), metallothionein I, polyhedrin, neuron specific enolase, tyrosine hydroxylase, beta-actin, or other promoters. In one embodiment, the promoter is a heterologous promoter.

In one example, the polynucleotide encoding Zscan4 is located downstream of the desired promoter. Optionally, an enhancer element is also included, and can generally be located anywhere on the vector and still have an enhancing effect. However, the amount of increased activity will generally diminish with distance.

Expression vectors including a polynucleotide encoding Zscan4 can be used to transform host cells. Hosts can include isolated microbial, yeast, insect and mammalian cells, as well as cells located in the organism. Biologically functional viral and plasmid DNA vectors capable of expression and replication in a host are known in the art, and can be used to transfect any cell of interest. Where the cell is a mammalian cell, the genetic change is generally achieved by introduction of the DNA into the genome of the cell (i.e., stable) or as an episome. Thus, host cells can be used to produce Zscan4 polypeptides. Alternatively, expression vectors can be used to transform host cells of interest, such as stem cells.

A “transfected cell” is a cell into which (or into an ancestor of which) has been introduced, by means of recombinant DNA techniques, a DNA molecule encoding Zscan4. Transfection of a host cell with recombinant DNA may be carried out by conventional techniques as are well known in the art. Where the host is prokaryotic, such as E. coli, competent cells which are capable of DNA uptake can be prepared from cells harvested after exponential growth phase and subsequently treated by the CaCl₂ method using procedures well known in the art. Alternatively, MgCl₂ or RbCl can be used. Transformation can also be performed after forming a protoplast of the host cell if desired, or by electroporation.

When the host is a eukaryote, such as a stem cell, such methods of transfection of DNA as calcium phosphate co-precipitates, conventional mechanical procedures such as microinjection, electroporation, insertion of a plasmid encased in liposomes, or virus vectors may be used. Eukaryotic cells can also be cotransformed with DNA sequences encoding Zscan4, and a second foreign DNA molecule encoding a selectable phenotype, such as neomycin resistance. Another method is to use a eukaryotic viral vector, such as simian virus 40 (SV40) or bovine papilloma virus, to transiently infect or transform eukaryotic cells and express the protein (see for example, Eukaryotic Viral Vectors, Cold Spring Harbor Laboratory, Gluzman ed., 1982). Other specific, non-limiting examples of viral vectors include adenoviral vectors, lentiviral vectors, retroviral vectors, and pseudorabies vectors.

Differentiation can be induced, or proliferation decreased, of any cell, either in vivo or in vitro, using the methods disclosed herein. In one embodiment, the cell is a stem cell, such as, but not limited to, an embryonic stem cell, a germline stem cell or a multipotent adult progenitor cell. In several examples, a Zscan4 polypeptide, or a polynucleotide encoding the Zscan4 polypeptide, is introduced into a stem cell to decrease differentiation and/or increase proliferation.

In one example, the cells are stem cells, such as embryonic stem cells. For example, murine, primate or human cells can be utilized. ES cells can proliferate indefinitely in an undifferentiated state. Furthermore, ES cells are totipotent cells, meaning that they can generate all of the cells present in the body (bone, muscle, brain cells, etc.). ES cells have been isolated from the inner cell mass (ICM) of the developing murine blastocyst (Evans et al., Nature 292:154-156, 1981; Martin et al., Proc. Natl. Acad. Sci. 78:7634-7636, 1981; Robertson et al., Nature 323:445-448, 1986). Additionally, human cells with ES properties have been isolated from the inner blastocyst cell mass (Thomson et al., Science 282:1145-1147, 1998) and developing germ cells (Shamblott et al., Proc. Natl. Acad. Sci. USA 95:13726-13731, 1998), and human and non-human primate embryonic stem cells have been produced (see U.S. Pat. No. 6,200,806, which is incorporated by reference herein).

As disclosed in U.S. Pat. No. 6,200,806, ES cells can be produced from human and non-human primates. In one embodiment, primate ES cells that express SSEA-3, SSEA-4, TRA-1-60, and TRA-1-81 are isolated in “ES medium” (see U.S. Pat. No. 6,200,806). ES medium consists of 80% Dulbecco's modified Eagle's medium (DMEM; no pyruvate, high glucose formulation, Gibco BRL), with 20% fetal bovine serum (FBS; Hyclone), 0.1 mM β-mercaptoethanol (Sigma), and 1% non-essential amino acid stock (Gibco BRL). Generally, primate ES cells are isolated on a confluent layer of murine embryonic fibroblasts in the presence of ES cell medium. In one example, embryonic fibroblasts are obtained from 12 day old fetuses from outbred mice (such as CF1, available from SASCO), but other strains may be used as an alternative. Tissue culture dishes treated with 0.1% gelatin (type I; Sigma) can be utilized. Distinguishing features of ES cells, as compared to the committed “multipotential” stem cells present in adults, include the capacity of ES cells to maintain an undifferentiated state indefinitely in culture, and the potential that ES cells have to develop into every different cell type. Unlike mouse ES cells, human ES (hES) cells do not express the stage-specific embryonic antigen SSEA-1, but express SSEA-4, which is another glycolipid cell surface antigen recognized by a specific monoclonal antibody (see, e.g., Amit et al., Devel. Biol. 227:271-278, 2000).

For rhesus monkey embryos, adult female rhesus monkeys (greater than four years old) demonstrating normal ovarian cycles are observed daily for evidence of menstrual bleeding (day 1 of cycle=the day of onset of menses). Blood samples are drawn daily during the follicular phase starting from day 8 of the menstrual cycle, and serum concentrations of luteinizing hormone are determined by radioimmunoassay. The female is paired with a male rhesus monkey of proven fertility from day 9 of the menstrual cycle until 48 hours after the luteinizing hormone surge; ovulation is taken as the day following the luteinizing hormone surge. Expanded blastocysts are collected by non-surgical uterine flushing at six days after ovulation. This procedure generally results in the recovery of an average 0.4 to 0.6 viable embryos per rhesus monkey per month (Seshagiri et al., Am J Primatol. 29:81-91, 1993).

For marmoset embryos, adult female marmosets (greater than two years of age) demonstrating regular ovarian cycles are maintained in family groups, with a fertile male and up to five progeny. Ovarian cycles are controlled by intramuscular injection of 0.75 g of the prostaglandin PGF2α analog cloprostenol (Estrumate, Mobay Corp, Shawnee, Kans.) during the middle to late luteal phase. Blood samples are drawn on day 0 (immediately before cloprostenol injection), and on days 3, 7, 9, 11, and 13. Plasma progesterone concentrations are determined by ELISA. The day of ovulation is taken as the day preceding a plasma progesterone concentration of 10 ng/ml or more. At eight days after ovulation, expanded blastocysts are recovered by a non-surgical uterine flush procedure (Thomson et al., J Med. Primatol. 23:333-336, 1994). This procedure results in the average production of 1.0 viable embryos per marmoset per month.
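The marmoset timing rule can be expressed as a short Python sketch. It reads the rule as applying to the first sampled day on which plasma progesterone reaches 10 ng/ml, takes ovulation as the preceding day, and schedules the uterine flush eight days after ovulation. The function name and the example progesterone values are hypothetical; days are counted from the cloprostenol injection (day 0).

```python
def marmoset_schedule(progesterone_by_day, threshold=10.0, flush_offset=8):
    """Return (ovulation_day, flush_day), or None if the threshold is never met.

    progesterone_by_day maps sampling day (relative to cloprostenol
    injection on day 0) to plasma progesterone in ng/ml.
    """
    for day in sorted(progesterone_by_day):
        if progesterone_by_day[day] >= threshold:
            ovulation_day = day - 1  # day preceding the first reading >= threshold
            return ovulation_day, ovulation_day + flush_offset
    return None


if __name__ == "__main__":
    # Illustrative readings on the sampling days named in the text.
    samples = {0: 1.2, 3: 2.0, 7: 4.5, 9: 12.0, 11: 20.0, 13: 25.0}
    print(marmoset_schedule(samples))
```

Because samples are drawn only on days 0, 3, 7, 9, 11, and 13, the computed ovulation day is limited by the sampling interval.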

The zona pellucida is removed from blastocysts, such as by brief exposure to pronase (Sigma). For immunosurgery, blastocysts are exposed to a 1:50 dilution of rabbit anti-marmoset spleen cell antiserum (for marmoset blastocysts) or a 1:50 dilution of rabbit anti-rhesus monkey (for rhesus monkey blastocysts) in DMEM for 30 minutes, then washed for 5 minutes three times in DMEM, then exposed to a 1:5 dilution of Guinea pig complement (Gibco) for 3 minutes. After two further washes in DMEM, lysed trophectoderm cells are removed from the intact inner cell mass (ICM) by gentle pipetting, and the ICM is plated on inactivated (3000 rads gamma irradiation) mouse embryonic fibroblasts.

After 7-21 days, ICM-derived masses are removed from endoderm outgrowths with a micropipette under direct observation with a stereo microscope, exposed to 0.05% Trypsin-EDTA (Gibco) supplemented with 1% chicken serum for 3-5 minutes, and dissociated by gentle pipetting through a flame polished micropipette.

Dissociated cells are re-plated on embryonic feeder layers in fresh ES medium, and observed for colony formation. Colonies demonstrating ES-like morphology are individually selected, and split again as described above. The ES-like morphology is defined as compact colonies having a high nucleus to cytoplasm ratio and prominent nucleoli. Resulting ES cells are then routinely split by brief trypsinization or exposure to Dulbecco's Phosphate Buffered Saline (PBS, without calcium or magnesium and with 2 mM EDTA) every 1-2 weeks as the cultures become dense. Early passage cells are also frozen and stored in liquid nitrogen.

Cell lines may be karyotyped with a standard G-banding technique (such as by the Cytogenetics Laboratory of the University of Wisconsin State Hygiene Laboratory, which provides routine karyotyping services) and compared to published karyotypes for the primate species.

Isolation of ES cell lines from other primate species would follow a similar procedure, except that the rate of development to blastocyst can vary by a few days between species, and the rate of development of the cultured ICMs will vary between species. For example, six days after ovulation, rhesus monkey embryos are at the expanded blastocyst stage, whereas marmoset embryos do not reach the same stage until 7-8 days after ovulation. The rhesus ES cell lines can be obtained by splitting the ICM-derived cells for the first time at 7-16 days after immunosurgery; whereas the marmoset ES cells were derived with the initial split at 7-10 days after immunosurgery. Because other primates also vary in their developmental rate, the timing of embryo collection, and the timing of the initial ICM split, varies between primate species, but the same techniques and culture conditions will allow ES cell isolation (see U.S. Pat. No. 6,200,806, which is incorporated herein by reference for a complete discussion of primate ES cells and their production). Human ES cell lines exist and can be used in the methods disclosed herein.

Human ES cells can also be derived from preimplantation in vitro fertilized (IVF) embryos. Experiments on unused human IVF-produced embryos are allowed in many countries, such as Singapore and the United Kingdom, if the embryos are less than 14 days old. Only high quality embryos are suitable for ES isolation. Present defined culture conditions for culturing the one cell human embryo to the expanded blastocyst have been described (see Bongso et al., Hum Reprod. 4:706-713, 1989). Co-culturing of human embryos with human oviductal cells results in the production of high quality blastocysts. IVF-derived expanded human blastocysts grown in cellular co-culture, or in improved defined medium, allow isolation of human ES cells with the same procedures described above for non-human primates (see U.S. Pat. No. 6,200,806).

Precursor cells can also be utilized with the methods disclosed herein. The precursor cells can be isolated from a variety of sources using methods known to one skilled in the art. The precursor cells can be of ectodermal, mesodermal or endodermal origin. Any precursor cells which can be obtained and maintained in vitro can potentially be used in accordance with the present methods. Such cells include cells of epithelial tissues such as the skin and the lining of the gut, embryonic heart muscle cells, and neural precursor cells (Stemple and Anderson, 1992, Cell 71:973-985).

In one example, the cells are mesenchymal progenitor cells. Mesenchymal progenitors give rise to a very large number of distinct tissues (Caplan, J. Orth. Res. 641-650, 1991). Mesenchymal cells capable of differentiating into bone and cartilage have also been isolated from marrow (Caplan, J. Orth. Res. 641-650, 1991). U.S. Pat. No. 5,226,914 describes an exemplary method for isolating mesenchymal stem cells from bone marrow.

In other examples, the cells are epithelial progenitor cells or keratinocytes, which can be obtained from tissues such as the skin and the lining of the gut by known procedures (Rheinwald, Meth. Cell Bio. 21A:229, 1980). In stratified epithelial tissue such as the skin, renewal occurs by mitosis of precursor cells within the germinal layer, the layer closest to the basal lamina. Precursor cells within the lining of the gut provide for a rapid renewal rate of this tissue. The cells can also be liver stem cells (see PCT Publication No. WO 94/08598) or kidney stem cells (see Karp et al., Dev. Biol. 91:5286-5290, 1994).

In one non-limiting example, neuronal precursor cells are utilized. Undifferentiated neural stem cells differentiate into neuroblasts and glioblasts which give rise to neurons and glial cells. During development, cells that are derived from the neural tube give rise to neurons and glia of the CNS. Certain factors present during development, such as nerve growth factor (NGF), promote the growth of neural cells. Methods of isolating and culturing neural stem cells and progenitor cells are well known to those of skill in the art (Hazel and Muller, 1997; U.S. Pat. No. 5,750,376). Methods for isolating and culturing neuronal precursor cells are disclosed, for example, in U.S. Pat. No. 6,610,540.

V. Zscan4 and Trim43 Promoter Sequences

A Zscan4 promoter or a Trim43 promoter can be included in an expression vector to direct expression of a heterologous nucleic acid sequence. Other expression control sequences including appropriate enhancers, transcription terminators, a start codon (i.e., ATG) in front of a protein-encoding gene, splicing signal for introns, maintenance of the correct reading frame of that gene to permit proper translation of mRNA, and stop codons can be included with the Zscan4 or Trim43 promoter in an expression vector. Generally, the promoter includes at least a minimal sequence sufficient to direct transcription of a heterologous nucleic acid sequence. In several examples, the heterologous nucleic acid sequence encodes a polypeptide. However, the heterologous nucleic acid can be any RNA sequence of interest, such as an inhibitory RNA.

Expression vectors typically contain an origin of replication as well as specific genes which allow phenotypic selection of the transformed cells. Vectors suitable for use include, but are not limited to, the pMSXND expression vector for expression in mammalian cells (Lee and Nathans, J. Biol. Chem. 263:3521, 1988). In one example, an enhancer is located upstream of the Zscan4 or Trim43 promoter, but enhancer elements can generally be located anywhere on the vector and still have an enhancing effect. However, the amount of increased activity will generally diminish with distance. Additionally, two or more copies of an enhancer sequence can be operably linked one after the other to produce an even greater increase in promoter activity.

Generally, an expression vector includes a nucleic acid sequence encoding a polypeptide of interest. A polypeptide of interest can be a heterologous polypeptide, such as a polypeptide that affects a function of the transfected cell. Polypeptides of interest include, but are not limited to, polypeptides that confer antibiotic resistance, receptors, oncogenes, and neurotransmitters. A polypeptide of interest can also be a marker polypeptide, which is used to identify a cell of interest. Marker polypeptides include fluorescent polypeptides, enzymes, or antigens that can be identified using conventional molecular biology procedures. For example, the polypeptide can be a fluorescent marker (such as green fluorescent protein, Emerald (Invitrogen, Carlsbad, Calif.), Strawberry (Clontech, Mountain View, Calif.), Aequorea victoria, or Discosoma DSRed); an antigenic marker (such as human growth hormone, human insulin, human HLA antigens); a cell-surface marker (such as CD4, or any cell surface receptor); or an enzymatic marker (such as lacZ, alkaline phosphatase). Techniques for identifying these markers in host cells include immunohistochemistry and fluorescent microscopy, and are well known in the art.

RNA molecules transcribed from an expression vector need not always be translated into a polypeptide to express a functional activity. Specific non-limiting examples of other molecules of interest include antisense RNA molecules complementary to an RNA of interest, ribozymes, small inhibitory RNAs, and naturally occurring or modified tRNAs.

Expression vectors including a Zscan4 or Trim43 promoter can be used to transform host cells. Hosts can include isolated microbial, yeast, insect and mammalian cells, as well as cells located in the organism. Biologically functional viral and plasmid DNA vectors capable of expression and replication in a host are known in the art, and can be used to transfect any cell of interest. Where the cell is a mammalian cell, the genetic change is generally achieved by introduction of the DNA into the genome of the cell (stable integration). However, the vector can also be maintained as an episome.

A “transfected cell” is a host cell into which (or into an ancestor of which) has been introduced, by means of recombinant DNA techniques, a DNA molecule including a Zscan4 promoter element. Transfection of a host cell with recombinant DNA may be carried out by conventional techniques as are well known to those skilled in the art. Where the host is prokaryotic, such as E. coli, competent cells which are capable of DNA uptake can be prepared from cells harvested after exponential growth phase and subsequently treated by the CaCl₂ method using procedures well known in the art. Alternatively, MgCl₂ or RbCl can be used. Transformation can also be performed after forming a protoplast of the host cell if desired, or by electroporation.

When the host is a eukaryote, such methods of transfection of DNA as calcium phosphate co-precipitates, conventional mechanical procedures such as microinjection, electroporation, insertion of a plasmid encased in liposomes, or virus vectors may be used. Eukaryotic cells can also be cotransformed with DNA sequences including the Zscan4 promoter, and a second foreign DNA molecule encoding a selectable phenotype, such as neomycin resistance. Another method is to use a eukaryotic viral vector, such as simian virus 40 (SV40) or bovine papilloma virus, to transiently infect or transform eukaryotic cells and express the protein (see for example, Eukaryotic Viral Vectors, Cold Spring Harbor Laboratory, Gluzman ed., 1982). Other specific, non-limiting examples of viral vectors include adenoviral vectors, lentiviral vectors, retroviral vectors, and pseudorabies vectors.

In one embodiment described in the Examples below, an expression vector comprising a Zscan4 promoter sequence operably linked to a heterologous polypeptide is used to identify cells that express Zscan4. In one embodiment, the Zscan4 promoter is a Zscan4c promoter. In some embodiments, the Zscan4c promoter comprises Zscan4c exon and/or intron sequence. The heterologous protein is typically a marker, an enzyme, or a fluorescent protein. In one embodiment, the heterologous protein is green fluorescent protein (GFP), or a variant of GFP, such as Emerald.

Provided herein is a method of identifying a subpopulation of stem cells expressing Zscan4. In one embodiment, the subpopulation is identified by transfecting the stem cells with an expression vector, wherein the expression vector comprises a Zscan4 promoter sequence and a reporter gene. In one embodiment, the Zscan4 promoter is a Zscan4c promoter. In another embodiment, the Zscan4c promoter comprises the nucleic acid sequence set forth as nucleotides 1-2540 of SEQ ID NO: 28, such as nucleotides 1-2643, 1-3250, or 1-3347 of SEQ ID NO: 28.
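The nucleotide ranges recited for the Zscan4c promoter are 1-based and inclusive, which differs from the 0-based, end-exclusive slicing used by most programming languages. A minimal Python sketch of extracting the recited fragments follows; the function name is hypothetical, and the stand-in sequence is a placeholder of arbitrary content, not SEQ ID NO: 28.

```python
# End positions of the promoter fragments recited in the text (1-based).
PROMOTER_FRAGMENT_ENDS = (2540, 2643, 3250, 3347)


def promoter_fragment(sequence, end, start=1):
    """Return nucleotides start..end of `sequence`, 1-based and inclusive."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("requested range lies outside the sequence")
    return sequence[start - 1:end]


if __name__ == "__main__":
    # Placeholder only: a 4000-nt repeat standing in for SEQ ID NO: 28.
    stand_in = "ACGT" * 1000
    for end in PROMOTER_FRAGMENT_ENDS:
        fragment = promoter_fragment(stand_in, end)
        print(f"nucleotides 1-{end}: {len(fragment)} nt")
```

Each longer fragment contains the 1-2540 fragment as its prefix, which is consistent with the longer ranges being described as comprising nucleotides 1-2540.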

The reporter gene can be any type of identifiable marker, such as an enzyme or a fluorescent protein. In one embodiment, the reporter gene is GFP or a variant of GFP, such as Emerald. Expression of the reporter gene indicates the cell expresses Zscan4. Methods of detecting expression of the reporter gene vary depending upon the type of reporter gene and are well known in the art. For example, when a fluorescent reporter is used, detection of expression can be achieved by fluorescence activated cell sorting or fluorescence microscopy. Identification of a subpopulation of stem cells expressing Zscan4 can be achieved with alternative methods, including, but not limited to, using antibodies specific for Zscan4 or by in situ hybridization. In one embodiment, the subpopulation of ES cells expressing Zscan4 is identified by detecting expression of one or more Zscan4 co-expressed genes, including AF067063, Tcstv1/Tcstv3, Tho4, Arginase II, BC061212, Gm428, Eif1a, EG668777 and Pif1.

Also described herein is an expression vector comprising a Trim43 promoter sequence operably linked to a heterologous polypeptide. The heterologous protein is typically a marker, an enzyme, or a fluorescent protein. In one embodiment, the heterologous protein is the fluorescent protein Strawberry. In some embodiments, the Trim43 promoter sequence is at least 70%, at least 80%, at least 90%, at least 95% or at least 99% identical to SEQ ID NO: 31. In another embodiment, the Trim43 promoter comprises SEQ ID NO: 31. In another embodiment, the Trim43 promoter consists of SEQ ID NO: 31.

Also provided herein are isolated ES cells comprising the Zscan4 or Trim43 expression vectors described herein. In one embodiment, the ES cells are a stable cell line.

VI. Transgenic Animals

The Zscan4 polynucleotide sequences disclosed herein can also be used in the production of transgenic animals such as transgenic mice, as described below. Transgenic animals can also be produced that contain polynucleotide sequences of one or more Zscan4 co-expressed genes, including AF067063, Tcstv1/Tcstv3, Tho4, Arginase II, BC061212, Gm428, Eif1a, EG668777 and Pif1.

In one embodiment, a non-human animal is generated that carries a transgene comprising a nucleic acid encoding Zscan4 operably linked to a promoter. Specific promoters of use include, but are not limited to, a tissue specific promoter such as, but not limited to, an immunoglobulin promoter, a neuronal specific promoter, or the insulin promoter. Specific promoters of use also include a constitutive promoter, such as, but not limited to, the thymidine kinase promoter, the human β-globin minimal promoter, or an actin promoter, amongst others. The Zscan4 promoter can also be used.

In another embodiment, the transgenic non-human animal carries a transgene comprising a nucleic acid encoding a heterologous peptide, such as a marker, enzyme or fluorescent protein, operably linked to a Zscan4 promoter. In one example, the Zscan4 promoter is a Zscan4c promoter, or a portion thereof. In another embodiment, the Zscan4c promoter comprises the nucleic acid sequence set forth as nucleotides 1-2540 of SEQ ID NO: 28, such as nucleotides 1-2643, 1-3250, or 1-3347 of SEQ ID NO: 28. In one example, the heterologous peptide is the fluorescent protein Emerald.

In another embodiment, the transgenic non-human animal carries a transgene comprising a nucleic acid encoding a heterologous peptide, such as a marker, enzyme or fluorescent protein, operably linked to a Trim43 promoter. In one example, the Trim43 promoter comprises the nucleotide sequence of SEQ ID NO: 31, or a portion thereof. The portion of SEQ ID NO: 31 to be included in the expression vector is at least a portion of SEQ ID NO: 31 that is capable of promoting transcription of the heterologous polypeptide in a cell in which Trim43 is expressed. In some embodiments, the Trim43 promoter sequence is at least 70%, at least 80%, at least 90%, at least 95% or at least 99% identical to SEQ ID NO: 31. In another embodiment, the Trim43 promoter comprises SEQ ID NO: 31. In another embodiment, the Trim43 promoter consists of SEQ ID NO: 31. In one example, the heterologous peptide is the fluorescent protein Strawberry.

In another embodiment, the transgenic non-human animal carries two transgenes, a transgene comprising the Zscan4 promoter linked to a nucleic acid sequence encoding a heterologous peptide, and a transgene comprising the Trim43 promoter linked to a nucleic acid sequence encoding a heterologous peptide, as described above. In some cases, the transgenic non-human animal is a mouse comprising the Zscan4 promoter transgene and the Trim43 promoter transgene. In one specific example, the heterologous polypeptide operably linked to the Zscan4 promoter sequence is the fluorescent protein Emerald and the heterologous polypeptide operably linked to the Trim43 promoter sequence is the fluorescent protein Strawberry. This mouse is referred to as a “rainbow” mouse (see Example 10 below).

Embryos obtained from transgenic “rainbow” animals exhibit green color at the late 2-cell stage and red color at the 4-cell to morula stages (with strongest expression at the morula stage). The expression of these colors at the proper timing and intensity indicates the progress of a correct genetic program, and thus, can be used as indicators of proper development of preimplantation embryos. These embryos have a variety of applications, including, but not limited to, development of optimized culture media for human embryos for in vitro fertilization (IVF); training of technicians and clinicians in the IVF clinic and research laboratories; testing of chemical compounds and drugs for embryo toxicity; and as indicators of successful nuclear reprogramming for nuclear transplantation/cloning procedures.
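As one way to picture how reporter color could serve as an indicator of proper timing, the following Python sketch compares an observed color with the color expected at a given stage. The stage labels, the inclusion of the 8-cell stage within the "4-cell to morula" range, and the function name are assumptions made for illustration; signal intensity, which the text also treats as informative, is not modeled.

```python
# Expected reporter color by stage, following the description above.
# The 8-cell entry is an assumption (it lies between 4-cell and morula).
EXPECTED_COLOR = {
    "late 2-cell": "green",  # Emerald driven by the Zscan4 promoter
    "4-cell": "red",         # Strawberry driven by the Trim43 promoter
    "8-cell": "red",
    "morula": "red",
}


def matches_expected(stage, observed_color):
    """Return True/False for a known stage, or None if the stage is not listed."""
    expected = EXPECTED_COLOR.get(stage)
    if expected is None:
        return None
    return observed_color == expected


def embryo_on_track(observations):
    """True if every (stage, color) observation at a listed stage matches."""
    results = [matches_expected(stage, color) for stage, color in observations]
    return all(r for r in results if r is not None)
```

An embryo observed as green at the late 2-cell stage and red at the morula stage would be scored as on track; stages not named in the description are ignored rather than scored.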

The nucleic acid sequences described herein can be introduced into a vector to produce a product that is then amplified, for example, by preparation in a bacterial vector, according to conventional methods (see, for example, Sambrook et al., Molecular Cloning: a Laboratory Manual, Cold Spring Harbor Press, 1989). The amplified construct is thereafter excised from the vector and purified for use in producing transgenic animals.

Any transgenic animal can be of use in the methods disclosed herein, provided the transgenic animal is a non-human animal. A “non-human animal” includes, but is not limited to, a non-human primate, a farm animal such as swine, cattle, and poultry, a sport animal or pet such as dogs, cats, horses, hamsters, rodents, or a zoo animal such as lions, tigers or bears. In one specific, non-limiting example, the non-human animal is a transgenic animal, such as, but not limited to, a transgenic mouse, cow, sheep, or goat. In one specific, non-limiting example, the transgenic animal is a mouse. In a particular example, the transgenic animal has altered proliferation and/or differentiation of a cell type as compared to a non-transgenic control (wild-type) animal of the same species.

A transgenic animal contains cells that bear genetic information received, directly or indirectly, by deliberate genetic manipulation at the subcellular level, such as by microinjection or infection with a recombinant virus, such that a recombinant DNA is included in the cells of the animal. This molecule can be integrated within the animal's chromosomes, or can be included as extrachromosomally replicating DNA sequences, such as might be engineered into yeast artificial chromosomes. A transgenic animal can be a “germ cell line” transgenic animal, such that the genetic information has been taken up and incorporated into a germ line cell, therefore conferring the ability to transfer the information to offspring. If such offspring in fact possess some or all of that information, then they, too, are transgenic animals.

Transgenic animals can readily be produced by one of skill in the art. For example, transgenic animals can be produced by introducing into single cell embryos DNA encoding a marker, in a manner such that the polynucleotides are stably integrated into the DNA of germ line cells of the mature animal and inherited in normal Mendelian fashion. Advances in technologies for embryo micromanipulation permit introduction of heterologous DNA into fertilized mammalian ova. For instance, totipotent or pluripotent stem cells can be transformed by microinjection, calcium phosphate mediated precipitation, liposome fusion, retroviral infection or other means. The transformed cells are then introduced into the embryo, and the embryo then develops into a transgenic animal. In one non-limiting method, developing embryos are infected with a retrovirus containing the desired DNA, and a transgenic animal is produced from the infected embryo.

In another specific, non-limiting example, the appropriate DNA(s) are injected into the pronucleus or cytoplasm of embryos, preferably at the single cell stage, and the embryos are allowed to develop into mature transgenic animals. These techniques are well known. For instance, reviews of standard laboratory procedures for microinjection of heterologous DNAs into mammalian (mouse, pig, rabbit, sheep, goat, cow) fertilized ova include: Hogan et al., Manipulating the Mouse Embryo, Cold Spring Harbor Press, 1986; Krimpenfort et al., Bio/Technology 9:86, 1991; Palmiter et al., Cell 41:343, 1985; Kraemer et al., Genetic Manipulation of the Early Mammalian Embryo, Cold Spring Harbor Laboratory Press, 1985; Hammer et al., Nature 315:680, 1985; Purcel et al., Science 244:1281, 1986; U.S. Pat. No. 5,175,385; U.S. Pat. No. 5,175,384.

VII. Antibodies

A Zscan4 polypeptide or a fragment or conservative variant thereof can be used to produce antibodies which are immunoreactive or specifically bind to an epitope of a Zscan4. Polyclonal antibodies, antibodies which consist essentially of pooled monoclonal antibodies with different epitopic specificities, as well as distinct monoclonal antibody preparations are included. In one embodiment, the Zscan4 antibodies recognize all Zscan4 proteins, including Zscan4a, Zscan4b, Zscan4c, Zscan4d, Zscan4e, Zscan4f and human ZSCAN4. In another embodiment, the antibodies specifically recognize only one Zscan4 protein. As used herein, the ability of an antibody to specifically recognize a particular Zscan4 protein means that the antibody detects expression of one Zscan4 protein, but none of the other Zscan4 proteins. In an alternative embodiment, the antibodies recognize two or more different Zscan4 proteins. For example, a Zscan4 antibody may recognize only the Zscan4 proteins comprising a SCAN domain (e.g., Zscan4c, Zscan4d, Zscan4f). Or, a Zscan4 antibody may recognize only the Zscan4 proteins comprising the zinc finger domains, but lacking the SCAN domain (e.g., Zscan4a, Zscan4b, Zscan4e).

Antibodies can also be raised against one or more proteins encoded by genes identified herein as Zscan4 co-expressed genes. Thus, in some embodiments, a polypeptide encoded by AF067063, Tcstyl/Tcstv3, Tho4, Arginase II, BC061212 and Gm428, Eif1a, EG668777 or Pif1, or a fragment or conservative variant thereof, can be used to produce antibodies which are immunoreactive or specifically bind to an epitope of the polypeptide.

In addition, antibodies can be generated that specifically bind Trim43. In one embodiment, a Trim43 polypeptide, or a fragment or conservative variant thereof, can be used to produce antibodies which are immunoreactive or specifically bind to an epitope of Trim43.

The preparation of polyclonal antibodies is well known to those skilled in the art. See, for example, Green et al., “Production of Polyclonal Antisera,” in: Immunochemical Protocols, pages 1-5, Manson, ed., Humana Press, 1992; Coligan et al., “Production of Polyclonal Antisera in Rabbits, Rats, Mice and Hamsters,” in: Current Protocols in Immunology, section 2.4.1, 1992.

The preparation of monoclonal antibodies likewise is conventional. See, for example, Kohler & Milstein, Nature 256:495, 1975; Coligan et al., sections 2.5.1-2.6.7; and Harlow et al. in: Antibodies: a Laboratory Manual, page 726, Cold Spring Harbor Pub., 1988. Briefly, monoclonal antibodies can be obtained by injecting mice with a composition comprising an antigen, verifying the presence of antibody production by removing a serum sample, removing the spleen to obtain B lymphocytes, fusing the B lymphocytes with myeloma cells to produce hybridomas, cloning the hybridomas, selecting positive clones that produce antibodies to the antigen, and isolating the antibodies from the hybridoma cultures. Monoclonal antibodies can be isolated and purified from hybridoma cultures by a variety of well-established techniques. Such isolation techniques include affinity chromatography with Protein-A Sepharose, size-exclusion chromatography, and ion-exchange chromatography. See, e.g., Coligan et al., sections 2.7.1-2.7.12 and sections 2.9.1-2.9.3; Barnes et al., Purification of Immunoglobulin G (IgG), in: Methods in Molecular Biology, Vol. 10, pages 79-104, Humana Press, 1992.

Methods of in vitro and in vivo multiplication of monoclonal antibodies are well known to those skilled in the art. Multiplication in vitro may be carried out in suitable culture media such as Dulbecco's Modified Eagle Medium or RPMI 1640 medium, optionally supplemented by a mammalian serum such as fetal calf serum or trace elements and growth-sustaining supplements such as normal mouse peritoneal exudate cells, spleen cells, thymocytes or bone marrow macrophages. Production in vitro provides relatively pure antibody preparations and allows scale-up to yield large amounts of the desired antibodies. Large-scale hybridoma cultivation can be carried out by homogenous suspension culture in an airlift reactor, in a continuous stirrer reactor, or in immobilized or entrapped cell culture. Multiplication in vivo may be carried out by injecting cell clones into mammals histocompatible with the parent cells, such as syngeneic mice, to cause growth of antibody-producing tumors. Optionally, the animals are primed with a hydrocarbon, especially oils such as pristane (tetramethylpentadecane) prior to injection. After one to three weeks, the desired monoclonal antibody is recovered from the body fluid of the animal.

Antibodies can also be derived from a subhuman primate antibody. General techniques for raising therapeutically useful antibodies in baboons can be found, for example, in PCT Publication No. WO 91/11465, 1991; and Losman et al., Int. J. Cancer 46:310, 1990.

Alternatively, an antibody that specifically binds a Zscan4 polypeptide can be derived from a humanized monoclonal antibody. Humanized monoclonal antibodies are produced by transferring mouse complementarity determining regions from heavy and light variable chains of the mouse immunoglobulin into a human variable domain, and then substituting human residues in the framework regions of the murine counterparts. The use of antibody components derived from humanized monoclonal antibodies obviates potential problems associated with the immunogenicity of murine constant regions. General techniques for cloning murine immunoglobulin variable domains are described, for example, by Orlandi et al., Proc. Natl. Acad. Sci. U.S.A. 86:3833, 1989. Techniques for producing humanized monoclonal antibodies are described, for example, by Jones et al., Nature 321:522, 1986; Riechmann et al., Nature 332:323, 1988; Verhoeyen et al., Science 239:1534, 1988; Carter et al., Proc. Natl. Acad. Sci. U.S.A. 89:4285, 1992; Sandhu, Crit. Rev. Biotech. 12:437, 1992; and Singer et al., J. Immunol. 150:2844, 1993.

Antibodies can be derived from human antibody fragments isolated from a combinatorial immunoglobulin library. See, for example, Barbas et al., in: Methods: a Companion to Methods in Enzymology, Vol. 2, page 119, 1991; Winter et al., Ann. Rev. Immunol. 12:433, 1994. Cloning and expression vectors that are useful for producing a human immunoglobulin phage library can be obtained, for example, from STRATAGENE Cloning Systems (La Jolla, Calif.).

In addition, antibodies can be derived from a human monoclonal antibody. Such antibodies are obtained from transgenic mice that have been “engineered” to produce specific human antibodies in response to antigenic challenge. In this technique, elements of the human heavy and light chain loci are introduced into strains of mice derived from embryonic stem cell lines that contain targeted disruptions of the endogenous heavy and light chain loci. The transgenic mice can synthesize human antibodies specific for human antigens, and the mice can be used to produce human antibody-secreting hybridomas. Methods for obtaining human antibodies from transgenic mice are described by Green et al., Nature Genet. 7:13, 1994; Lonberg et al., Nature 368:856, 1994; and Taylor et al., Int. Immunol. 6:579, 1994.

Antibodies include intact molecules as well as fragments thereof, such as Fab, F(ab′)₂, and Fv which are capable of binding the epitopic determinant. These antibody fragments retain some ability to selectively bind with their antigen or receptor and are defined as follows:

(1) Fab, the fragment which contains a monovalent antigen-binding fragment of an antibody molecule, can be produced by digestion of whole antibody with the enzyme papain to yield an intact light chain and a portion of one heavy chain;

(2) Fab′, the fragment of an antibody molecule that can be obtained by treating whole antibody with pepsin, followed by reduction, to yield an intact light chain and a portion of the heavy chain; two Fab′ fragments are obtained per antibody molecule;

(3) F(ab′)₂, the fragment of the antibody that can be obtained by treating whole antibody with the enzyme pepsin without subsequent reduction; F(ab′)₂ is a dimer of two Fab′ fragments held together by two disulfide bonds;

(4) Fv, defined as a genetically engineered fragment containing the variable region of the light chain and the variable region of the heavy chain expressed as two chains; and

(5) Single chain antibody (SCA), defined as a genetically engineered molecule containing the variable region of the light chain and the variable region of the heavy chain, linked by a suitable polypeptide linker as a genetically fused single chain molecule.

Methods of making these fragments are known in the art (see for example, Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, New York, 1988). An epitope is any antigenic determinant on an antigen to which the paratope of an antibody binds. Epitopic determinants usually consist of chemically active surface groupings of molecules such as amino acids or sugar side chains and usually have specific three dimensional structural characteristics, as well as specific charge characteristics.

Antibody fragments can be prepared by proteolytic hydrolysis of the antibody or by expression in E. coli of DNA encoding the fragment. Antibody fragments can be obtained by pepsin or papain digestion of whole antibodies by conventional methods. For example, antibody fragments can be produced by enzymatic cleavage of antibodies with pepsin to provide a 5S fragment denoted F(ab′)₂. This fragment can be further cleaved using a thiol reducing agent, and optionally a blocking group for the sulfhydryl groups resulting from cleavage of disulfide linkages, to produce 3.5S Fab′ monovalent fragments. Alternatively, an enzymatic cleavage using pepsin produces two monovalent Fab′ fragments and an Fc fragment directly (see U.S. Pat. No. 4,036,945 and U.S. Pat. No. 4,331,647, and references contained therein; Nisonoff et al., Arch. Biochem. Biophys. 89:230, 1960; Porter, Biochem. J. 73:119, 1959; Edelman et al., Methods in Enzymology, Vol. 1, page 422, Academic Press, 1967; and Coligan et al. at sections 2.8.1-2.8.10 and 2.10.1-2.10.4).

Other methods of cleaving antibodies, such as separation of heavy chains to form monovalent light-heavy chain fragments, further cleavage of fragments, or other enzymatic, chemical, or genetic techniques may also be used, so long as the fragments bind to the antigen that is recognized by the intact antibody.

For example, Fv fragments comprise an association of V_(H) and V_(L) chains. This association may be noncovalent (Inbar et al., Proc. Natl. Acad. Sci. U.S.A. 69:2659, 1972). Alternatively, the variable chains can be linked by an intermolecular disulfide bond or cross-linked by chemicals such as glutaraldehyde. See, e.g., Sandhu, supra. Preferably, the Fv fragments comprise V_(H) and V_(L) chains connected by a peptide linker. These single-chain antigen binding proteins (sFv) are prepared by constructing a structural gene comprising DNA sequences encoding the V_(H) and V_(L) domains connected by an oligonucleotide. The structural gene is inserted into an expression vector, which is subsequently introduced into a host cell such as E. coli. The recombinant host cells synthesize a single polypeptide chain with a linker peptide bridging the two V domains. Methods for producing sFvs are known in the art (see Whitlow et al., Methods: a Companion to Methods in Enzymology, Vol. 2, page 97, 1991; Bird et al., Science 242:423, 1988; U.S. Pat. No. 4,946,778; Pack et al., Bio/Technology 11:1271, 1993; and Sandhu, supra).

Another form of an antibody fragment is a peptide coding for a single complementarity-determining region (CDR). CDR peptides (“minimal recognition units”) can be obtained by constructing genes encoding the CDR of an antibody of interest. Such genes are prepared, for example, by using the polymerase chain reaction to synthesize the variable region from RNA of antibody-producing cells (Larrick et al., Methods: a Companion to Methods in Enzymology, Vol. 2, page 106, 1991).

Antibodies can be prepared using an intact polypeptide or fragments containing small peptides of interest as the immunizing antigen. The polypeptide or a peptide used to immunize an animal can be derived from substantially purified polypeptide produced in host cells, in vitro translated cDNA, or chemical synthesis, and can be conjugated to a carrier protein, if desired. Commonly used carriers which are chemically coupled to the peptide include keyhole limpet hemocyanin (KLH), thyroglobulin, bovine serum albumin (BSA), and tetanus toxoid. The coupled peptide is then used to immunize the animal (e.g., a mouse, a rat, or a rabbit).

Polyclonal or monoclonal antibodies can be further purified, for example, by binding to and elution from a matrix to which the polypeptide or a peptide to which the antibodies were raised is bound. Those of skill in the art will know of various techniques common in the immunology arts for purification and/or concentration of polyclonal antibodies, as well as monoclonal antibodies (see, for example, Coligan et al., Unit 9, Current Protocols in Immunology, Wiley Interscience, 1991).

It is also possible to use the anti-idiotype technology to produce monoclonal antibodies which mimic an epitope. For example, an anti-idiotypic monoclonal antibody made to a first monoclonal antibody will have a binding domain in the hypervariable region that is the “image” of the epitope bound by the first monoclonal antibody.

Binding affinity for a target antigen is typically measured or determined by standard antibody-antigen assays, such as competitive assays, saturation assays, or immunoassays such as ELISA or RIA. Such assays can be used to determine the dissociation constant of the antibody. The “dissociation constant” is an inverse measure of the affinity of an antibody for an antigen. Specificity of binding between an antibody and an antigen exists if the dissociation constant (K_(D)=1/K, where K is the affinity constant) of the antibody is, for example, <1 μM, <100 nM, or <0.1 nM. Antibody molecules will typically have a K_(D) in the lower ranges. K_(D)=[Ab][Ag]/[Ab−Ag] where [Ab] is the concentration at equilibrium of the antibody, [Ag] is the concentration at equilibrium of the antigen and [Ab−Ag] is the concentration at equilibrium of the antibody-antigen complex. Typically, the binding interactions between antigen and antibody include reversible noncovalent associations such as electrostatic attraction, Van der Waals forces and hydrogen bonds.
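
By way of illustration only, the relationship between the equilibrium concentrations and the dissociation constant can be sketched in a few lines of Python. The sketch assumes the standard definition in which K_(D) is the product of the free antibody and free antigen concentrations divided by the complex concentration, so that K_(D) is the reciprocal of the affinity constant K. The function names and the numerical concentrations are hypothetical and are not taken from any experiment described herein.

```python
def dissociation_constant(ab_free, ag_free, complex_conc):
    """Return K_D = [Ab][Ag] / [Ab-Ag]; all inputs in mol/L at equilibrium."""
    if complex_conc <= 0:
        raise ValueError("complex concentration must be positive")
    return ab_free * ag_free / complex_conc


def affinity_constant(ab_free, ag_free, complex_conc):
    """Return the affinity constant K = 1 / K_D, in L/mol."""
    return 1.0 / dissociation_constant(ab_free, ag_free, complex_conc)


# Example cut-offs named in the text, expressed in mol/L.
THRESHOLDS_M = {"<1 uM": 1e-6, "<100 nM": 1e-7, "<0.1 nM": 1e-10}


def thresholds_met(k_d):
    """Map each example cut-off to True if k_d falls below it."""
    return {label: k_d < limit for label, limit in THRESHOLDS_M.items()}


# Hypothetical equilibrium concentrations (mol/L), chosen only for illustration.
k_d = dissociation_constant(ab_free=2e-9, ag_free=5e-9, complex_conc=1e-9)
print(f"K_D = {k_d:.2e} M")
print(thresholds_met(k_d))
```

With these illustrative values the K_(D) is about 10 nM, which satisfies the first two example cut-offs but not the most stringent one.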

Effector molecules, e.g., therapeutic, diagnostic, or detection moieties can be linked to an antibody that specifically binds Zscan4, using any number of means known to those of skill in the art. Exemplary effector molecules include, but are not limited to, radiolabels, fluorescent markers, or toxins (e.g. Pseudomonas exotoxin (PE), see “Monoclonal Antibody-Toxin Conjugates: Aiming the Magic Bullet,” Thorpe et al., “Monoclonal Antibodies in Clinical Medicine,” Academic Press, pp. 168-190, 1982; Waldmann, Science, 252: 1657, 1991; U.S. Pat. No. 4,545,985 and U.S. Pat. No. 4,894,443, for a discussion of toxins and conjugation). Both covalent and noncovalent attachment means may be used. The procedure for attaching an effector molecule to an antibody varies according to the chemical structure of the effector. Polypeptides typically contain a variety of functional groups; e.g., carboxylic acid (COOH), free amine (—NH₂) or sulfhydryl (—SH) groups, which are available for reaction with a suitable functional group on an antibody to result in the binding of the effector molecule. Alternatively, the antibody is derivatized to expose or attach additional reactive functional groups. The derivatization may involve attachment of any of a number of linker molecules such as those available from Pierce Chemical Company, Rockford, Ill. The linker can be any molecule used to join the antibody to the effector molecule. The linker is capable of forming covalent bonds to both the antibody and to the effector molecule. Suitable linkers are well known to those of skill in the art and include, but are not limited to, straight or branched-chain carbon linkers, heterocyclic carbon linkers, or peptide linkers. Where the antibody and the effector molecule are polypeptides, the linkers may be joined to the constituent amino acids through their side groups (e.g., through a disulfide linkage to cysteine) or to the alpha carbon amino and carboxyl groups of the terminal amino acids.

In some circumstances, it is desirable to free the effector molecule from the antibody when the immunoconjugate has reached its target site. Therefore, in these circumstances, immunoconjugates will comprise linkages that are cleavable in the vicinity of the target site. Cleavage of the linker to release the effector molecule from the antibody may be prompted by enzymatic activity or conditions to which the immunoconjugate is subjected either inside the target cell or in the vicinity of the target site.

In view of the large number of methods that have been reported for attaching a variety of radiodiagnostic compounds, radiotherapeutic compounds, labels (e.g., enzymes or fluorescent molecules), drugs, toxins, and other agents to antibodies, one skilled in the art will be able to determine a suitable method for attaching a given agent to an antibody or other polypeptide.

The following examples are provided to illustrate certain particular features and/or embodiments. These examples should not be construed to limit the invention to the particular features or embodiments described.

EXAMPLES

The characterization of Zscan4 is disclosed herein. Zscan4 is shown herein to exhibit transient and specific expression at the late 2-cell embryonic stage and in embryonic stem cells. Without being bound by theory, Zscan4 is the only gene that is exclusively expressed during the first wave of de novo transcription, zygotic genome activation.

Zscan4 was identified from a cDNA clone derived from ES cells (clone number C0348C03) and subsequently sequenced by the Mammalian Gene Collection project (Gerhard et al., Genome Res. 14:2121-2127, 2004). The cDNA sequence, deposited under Genbank Accession No. BC050218 (SEQ ID NO: 11), comprised 2292 bp organized into 4 exons encoding a protein of 506 amino acids. As described in the Examples below, using this cDNA clone as a probe, a high level of Zscan4 transcript was detected in late 2-cell stage embryos. Since the original cDNA was isolated from ES cells, RT-PCR was performed on RNAs derived from late 2-cell stage embryos and the amplification product was sequenced, as described in the Examples below. The amplified sequence was 2268 bp in length and, like the cDNA isolated from ES cells, encoded a protein of 506 amino acids. Analysis of the nucleotide and amino acid sequences of the cDNA clones isolated from ES cells and late 2-cell embryos showed that they were derived from two different, but similar, genes.

As described in the Examples below, nine Zscan4 gene copies were identified in the mouse genome. Three copies are pseudogenes and were designated Zscan4-ps1 (SEQ ID NO: 12), Zscan4-ps2 (SEQ ID NO: 13) and Zscan4-ps3 (SEQ ID NO: 14), according to the convention of mouse gene nomenclature. The remaining six gene copies are transcribed and encode ORFs, thus they were named Zscan4a (SEQ ID NOs: 15 and 16), Zscan4b (SEQ ID NOs: 17 and 18), Zscan4c (SEQ ID NOs: 19 and 20), Zscan4d (SEQ ID NOs: 21 and 22), Zscan4e (SEQ ID NOs: 23 and 24) and Zscan4f (SEQ ID NOs: 25 and 26). Zscan4c, Zscan4d and Zscan4f encode proteins of 506 amino acids, while Zscan4a, Zscan4b and Zscan4e encode shorter proteins of 360, 195 and 195 amino acids, respectively. A polypeptide comprising any of the amino acid sequences set forth as SEQ ID NOs: 16, 18, 20, 22, 24, 26 or 30, or a polynucleotide encoding any of these polypeptides, is of use in the methods disclosed herein. A polynucleotide encoding a Zscan4 pseudogene set forth as SEQ ID NOs: 12, 13 or 14 is also of use in the methods disclosed herein.

Analysis of the expression levels of Zscan4 demonstrated that expression of each of the six Zscan4 genes could be detected in ES cells, with Zscan4c being the predominant transcript. Zscan4d was the predominant transcript in 2-cell stage embryos; however, low levels of Zscan4a, Zscan4e and Zscan4f could also be detected. These findings are consistent with the origin of each cDNA clone since Zscan4c was derived from the ES cell cDNA library and Zscan4d was derived from the 2-cell embryo cDNA library. Furthermore, expression of Zscan4 was not detected in blastocysts (including the inner cell mass) or early blastocyst outgrowth. After approximately six days of outgrowth, Zscan4 expression was detected in a subpopulation of undifferentiated ES cells.

It is shown herein that expression of Zscan4 is temporally regulated and its expression or lack of expression at different embryonic stages is critical to proper development. As described in the Examples below, inhibition of Zscan4 expression in embryos blocked the 2- to 4-cell embryonic transition, prevented blastocysts from expanding, prevented blastocysts from implanting and prevented proliferation of ES cells from blastocyst outgrowths.

Also described herein is the development of a mouse ES cell line expressing a heterologous protein, Emerald, under the control of a Zscan4 promoter. Further described is the identification of nine Zscan4 co-expressed genes exhibiting 2-cell stage specific expression.

Also shown herein is the identification of Trim43 as a gene exhibiting expression during the 4-cell to morula embryonic stages, with the highest level of expression observed at the morula stage. Also described herein is the development of a transgenic mouse, which comprises two transgenes, the first comprising Emerald operably linked to the Zscan4c promoter and the second comprising Strawberry operably linked to the Trim43 promoter.

Example 1

Materials and Methods

Identification and Cloning of the Mouse Zscan4d Gene

Using DNA microarray data of mouse preimplantation embryos (Hamatani et al., Dev. Cell 6:117-131, 2004), the Zscan4d gene was identified for its specific expression in 2-cell embryos. A corresponding cDNA clone (no. C0348C03; R1 ES cells, 129 strain; Genbank Accession No. BC050218, SEQ ID NO: 11) was identified in the mouse cDNA collection described previously (Sharov et al., PLoS Biol. 1:E74, 2003). Based on this full-length cDNA sequence, a primer pair (5′-cctccctgggcttcttggcat-3′, SEQ ID NO: 1; 5′-agctgccaaccagaaagacactgt-3′, SEQ ID NO: 2) was designed and used to PCR-amplify the full-length cDNA sequence of this gene from 2-cell embryos (B6D2F1 mouse). In brief, mRNA was extracted from 2-cell embryos and treated with DNase (DNA-free, Ambion). The mRNA was annealed with an oligo-dT primer and reverse-transcribed into cDNA with ThermoScript Reverse Transcriptase (Invitrogen). A full-length cDNA clone was PCR-amplified with Ex Taq Polymerase (Takara Mirus Bio, Madison, Wis.), purified with the Wizard SV Gel and PCR Clean-Up System (Promega Biosciences, San Luis Obispo, Calif.), cloned into a pENTR plasmid vector with the Directional TOPO Cloning Kit (Invitrogen), and completely sequenced using the BigDye Terminator kit (PE Applied Biosystems, Foster City, Calif.) and DyeEX 96 Kit (Qiagen, Valencia, Calif.) on an ABI 3100 Genetic Analyzer (PE Applied Biosystems). The sequence is set forth herein as SEQ ID NO: 21.

The WU-BLAST (available online) and UCSC genome browser were used to obtain Zscan4 orthologs in the human genome sequence. Open reading frames (ORFs) were deduced by ORF finder (available online from the National Center for Biotechnology Information) and protein domains were identified by Pfam HMM database (available online). Orthologous relationships were assessed with the phylogenetic tree of amino acid sequences determined by a sequence distance method and the Neighbor Joining (NJ) algorithm (Saitou and Nei, 1987) using Vector NTI software (Invitrogen, Carlsbad, Calif.).

All gene names and gene symbols were reviewed and approved by the mouse gene nomenclature committee.

Southern Blot Analysis

Southern blot analysis was carried out to validate the genome sequence of the Zscan4 locus assembled using individual BAC clone sequences downloaded from the public database (RPCI-23 and RPCI-24 BAC libraries: C57BL/6J strain). A probe containing exon 3 was designed and amplified from mouse DNA extracted from ES cells (C57BL/6) using a primer pair (5′-gcattcctacataccaatta-3′, SEQ ID NO: 3; 5′-gatttaatttagctgggctg-3′, SEQ ID NO: 4). The PCR product was purified using the GFX PCR DNA and Gel Band Purification Kit (GE Healthcare). Fifteen μg of mouse genomic DNA extracted from ES cells (BL6.9 line derived from C57BL/6 strain) was digested overnight with restriction enzymes (MspI, TaqI, and MspI/TaqI, see FIG. 3B), fractionated on a 1% (w/v) agarose gel, transferred and immobilized onto nitrocellulose membranes. Blots were hybridized with random-primed ³²P-labeled DNA probes under standard conditions. Membranes were subjected to 3 washes of 30 min each (2×SSC/0.1% (w/v) SDS at room temperature, 0.5×SSC/0.1% (w/v) SDS at 42° C., and 0.1×SSC/0.1% (w/v) SDS at room temperature) and autoradiographed for 48 hours at −80° C.

Measurement Of Gene Expression Levels

cDNAs from ES cells (129.3 ES cells purchased from the Transgenic Core Laboratory of the Johns Hopkins University School of Medicine, Baltimore, Md.) and 2-cell embryos (B6D2F1 mice) were synthesized. Zscan4 cDNA fragments were amplified using a Zscan4-specific primer pair (Zscan4_For, 5′-cagatgccagtagacaccac-3′, SEQ ID NO: 5; Zscan4_Rev, 5′-gtagatgttccttgacttgc-3′, SEQ ID NO: 6), which was 100%-matched to all Zscan4 paralogs. These cDNA fragments were sequenced using the following primers: Zscan4_For, 5′-cagatgccagtagacaccac-3′, SEQ ID NO: 5; Zscan4_400Rev, 5′-ggaagtgttatagcaattgttc-3′, SEQ ID NO: 7; Zscan4_Rev, 5′-gtagatgttccttgacttgc-3′, SEQ ID NO: 6; and Zscan4_300Rev, 5′-gtgttatagcaattgttcttg-3′, SEQ ID NO: 8. Electropherograms of these sequences were used to calculate the relative expression levels of nine paralogous copies of Zscan4 in the following manner. Based on sequence information of transcripts (either predicted from the genome sequence or determined by sequencing cDNA clones), nucleotide positions were identified where one or a few paralogous copies can be distinguished based on the nucleotide mismatches. The phred base calling program (version 0.020425.c (Ewing et al., Genome Res. 8:175-185, 1998)) was used to obtain the amplitudes of all four bases in the electropherogram for those nucleotide sites. After subtracting the noise level (i.e., the average of amplitudes of the bases that are not present in any of the nine paralogous copies), the amplitudes of each base (A, T, G, C) were obtained. The expression levels of each of the paralogous copies were calculated by least-squares fitting, which found the expression levels that are most consistent with all mismatched nucleotide positions.
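
The least-squares step described above can be illustrated with a minimal Python sketch. The sketch is not the procedure actually used; it assumes that, at each discriminating site, the noise-subtracted amplitude of a base (as a fraction of the total at that site) is approximately the sum of the relative expression levels of the copies carrying that base. The copy names, site positions and amplitudes are hypothetical, and negative fitted values are simply clipped to zero before renormalizing, which is an added assumption.

```python
def solve_linear(m, v):
    """Solve m x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("chosen sites do not distinguish all copies")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def paralog_fractions(sequences, amplitudes):
    """Least-squares estimate of the relative level of each paralogous copy.

    sequences:  {copy_name: {site: base}} at the discriminating sites
    amplitudes: {site: {base: noise-subtracted amplitude}}
    """
    names = sorted(sequences)
    rows, obs = [], []
    for site, by_base in amplitudes.items():
        total = sum(by_base.values())
        for base in "ACGT":
            # Indicator row: which copies carry this base at this site.
            rows.append([1.0 if sequences[n][site] == base else 0.0 for n in names])
            obs.append(by_base.get(base, 0.0) / total)
    k = len(names)
    # Normal equations (A^T A) x = A^T y of the least-squares problem.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, obs)) for i in range(k)]
    x = [max(0.0, value) for value in solve_linear(ata, aty)]
    scale = sum(x)
    return {n: value / scale for n, value in zip(names, x)}


# Hypothetical input: three copies distinguished at two mismatch sites.
sequences = {
    "copyA": {101: "G", 250: "C"},
    "copyB": {101: "A", 250: "T"},
    "copyC": {101: "A", 250: "C"},
}
amplitudes = {101: {"G": 600.0, "A": 400.0}, 250: {"T": 300.0, "C": 700.0}}
fractions = paralog_fractions(sequences, amplitudes)
print(fractions)
```

In this toy input the amplitudes were constructed from relative levels of 0.6, 0.3 and 0.1, so the fit recovers those values. With real traces the system is overdetermined and noisy, and the fit returns the levels most consistent with all sites taken together.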

Collection and Manipulation of Embryos

Four- to six-week old B6D2F1 mice were superovulated by injecting 5 IU pregnant mare serum gonadotropin (PMS; Sigma, St Louis, Mo., USA) followed 46-47 h later by 5 IU human chorionic gonadotropin (HCG; Sigma) (Protocol #220MSK-Mi approved by the National Institute on Aging Animal Care and Use Committee). Unfertilized eggs were harvested at 21 h post-HCG according to the standard method (Nagy et al., 2003, “Manipulation of the Mouse Embryo, A Laboratory Manual,” Cold Spring Harbor Laboratory Press, New York). After removing cumulus cells by incubation in M2 medium (MR-0,5-D) supplemented with bovine testicular hyaluronidase (HY, 0.1% (w/v), 300 U mg⁻¹), unfertilized eggs were thoroughly washed, selected for good morphology and collected. Fertilized eggs (1-cell embryos) were also harvested from mated superovulated mice in the same way as unfertilized eggs. Fertilized eggs (1-cell embryos) were cultured in synthetic oviductal medium enriched with potassium (KSOMaa, MR-121-D) at 37° C. in an atmosphere of 5% CO₂. For the embryo transfer procedure, 3.5 d.p.c. blastocysts were transferred into the uteri of 2.5 d.p.c. pseudopregnant ICR female mice.

To synchronize in vitro embryo development, embryos with two pronuclei (PN) were selected. When some of these 1-cell stage embryos started to cleave, the early 2-cell stage embryos were selected and transferred to another microdrop culture. The early 2-cell stage embryos were cultured until some of them started the second cleavage, and the embryos that were still at the 2-cell stage were collected. These embryos were thus synchronized at the late 2-cell stage.

DNA was microinjected into embryos according to the following procedures.

(1) Pronuclear injection: Plasmid vectors constitutively expressing an siRNA against mouse Zscan4 were constructed by inserting the following target sequences in a pRNAT-U6.1/Neo vector (GenScript Corp., Scotch Plains, N.J., USA): shZscan4 (gagtgaattgctttgtgtc, SEQ ID NO: 9) and shControl (randomized 21-mer, agagacatagaatcgcacgca, SEQ ID NO: 10). This vector contains a green fluorescent protein (GFP) marker under a cytomegalovirus (CMV) promoter. For RNA interference experiments, 1-2 μl (2-3 ng/μl) of a linearized vector DNA (shZscan4 or shControl) was microinjected into the male pronucleus of zygotes. A plasmid vector constitutively expressing the Zscan4d gene was constructed by cloning the CDS of Zscan4d into a plasmid pPyCAGIP (Chambers et al., Cell 113:643-655, 2003). For overexpression experiments, 1-2 μl (2-3 ng/μl) of plasmid DNA (Zscan4d-inserted or no insert pPyCAGIP vector) linearized by ScaI was microinjected into the male pronucleus of zygotes.

(2) Cytoplasmic injection: Transient RNA interference experiments were carried out by microinjecting ~10 μl (5 ng/μl) of oligonucleotide (siZscan4, plus-siZscan4, and siControl) into the cytoplasm of zygotes. The optimal amount of siRNA was determined by testing different concentrations of siRNA (4, 20, and 100 ng/μl).

All siRNAs were resuspended and diluted with the microinjection buffer (Specialty Media). The transfer of cultured blastocysts into pseudopregnant recipients was done according to the standard protocol (Nagy et al., 2003, “Manipulation of the Mouse Embryo, A Laboratory Manual,” Cold Spring Harbor Laboratory Press, New York). All media were purchased from Specialty Media (Phillipsburg, N.J.).

Culture of ES Cells and Blastocyst Outgrowth

A mouse ES cell line (129.3 line derived from strain 129 and purchased from The Transgenic Core Laboratory of the Johns Hopkins University School of Medicine, Baltimore, Md., USA) was first cultured for two passages on a gelatin-coated culture dish in the presence of leukemia inhibitory factor (LIF) to remove contaminating feeder cells. Cells were then seeded on gelatin-coated 6-well plates at a density of 1-2×10⁵/well (1-2×10⁴/cm²) and cultured for 3 days with complete ES medium (DMEM; 15% FBS; 1000 U/ml ESGRO (mLIF; Chemicon, Temecula, Calif.); 1 mM sodium pyruvate; 0.1 mM NEAA; 2 mM glutamate; 0.1 mM beta-mercaptoethanol; and 50 U/50 μg per ml penicillin/streptomycin).

For the outgrowth experiments, blastocysts at 3.5 days post coitum (d.p.c.) were cultured individually in DMEM (Gibco catalog no. 10313-021) supplemented with 15% fetal bovine serum, 15 mM HEPES buffer, 100 units/ml of penicillin, 100 μg/ml of streptomycin, 100 μM nonessential amino acids, 4.5 mM of L-glutamine, and 100 μM of β-mercaptoethanol on gelatinized chamber slides at 37° C. in 5% CO2.

Whole Mount In Situ Hybridization (WISH)

A plasmid DNA (clone C0348C03) was digested with SalI/NotI and transcribed in vitro into a digoxigenin-labeled antisense probe and a sense probe as control. Embryos obtained from young (7 weeks old) B6D2F1/J mice were fixed in 4% paraformaldehyde and used to perform whole mount in situ hybridization (WISH) according to the previously described protocol. WISH was also carried out on cultured ES cells according to the same protocol (Yoshikawa et al., Gene Expr. Patterns 6:213-224, 2006).

Quantitative Reverse Transcriptase PCR

Embryos for quantitative reverse transcriptase (qRT)-PCR experiments were collected as described above and harvested at 23, 43, 55, 66, 80 and 102 hours post-hCG for 1-cell, early 2-cell, late 2-cell, 4-cell, 8-cell, morula and blastocyst embryos, respectively. Three subsets of 10 synchronized and intact embryos were transferred in PBT 1X (PBS supplemented with 0.1% Tween X20) and stored in liquid nitrogen. These pools of embryos were mechanically ruptured by a freeze/thaw cycle and directly used as a template for cDNA preparations. The Ovation system (NuGen Technologies, San Carlos, Calif., USA) was used to synthesize cDNAs from each pool. The cDNAs were then diluted 1:25 in a total of 1000 μl and 2 μl was used as a template for qPCR. The qPCR was performed on the ABI 7900HT Sequence Detection System (Applied Biosystems, Foster City, Calif., USA) as previously described (Falco et al., Reprod. Biomed. Online 13:394-403, 2006) and data were normalized to Chuk and H2afz with the ΔΔCt method (Falco et al., Reprod. Biomed. Online 13:394-403, 2006; Livak and Schmittgen, Methods 25:402-408, 2001). Embryos subjected to RNA interference experiments were analyzed in the same way as described above for the normal preimplantation embryos.
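
The ΔΔCt normalization referred to above can be illustrated with a minimal sketch. It assumes approximately 100% amplification efficiency, averages the Ct values of the two reference genes (Chuk and H2afz), and treats one developmental stage as the calibrator. The function name and all Ct values are hypothetical and are not data from these experiments.

```python
from statistics import mean


def relative_expression(target_ct, reference_cts, calib_target_ct, calib_reference_cts):
    """Relative expression by the ΔΔCt method, i.e. 2 ** -ΔΔCt.

    Assumes ~100% PCR efficiency. Averaging the reference-gene Ct values
    is an illustrative choice, not a detail taken from the text.
    """
    delta_sample = target_ct - mean(reference_cts)             # ΔCt of the sample
    delta_calib = calib_target_ct - mean(calib_reference_cts)  # ΔCt of the calibrator
    return 2.0 ** -(delta_sample - delta_calib)


# Hypothetical Ct values: target gene against two reference genes.
fold_change = relative_expression(22.0, [20.0, 20.0], 27.0, [20.0, 20.0])
print(fold_change)  # 32.0: a 5-cycle difference corresponds to a 32-fold change
```

Averaging the reference Ct values is equivalent to normalizing to the geometric mean of the reference transcript quantities, which is one common way of combining two reference genes.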

Example 2 Identification of 2-Cell-Specific Genes During Preimplantation Development

After fertilization, the maternal genetic program governed by maternally stored RNAs and proteins must be switched to the embryonic genetic program governed by de novo transcription, called zygotic genome activation (ZGA), from the newly-formed zygotic genome (DePamphilis et al., “Activation of Zygotic Gene Expression” In Advances in Developmental Biology and Biochemistry, Vol. 12, pp. 56-84, Elsevier Science B.V., 2002; Latham and Schultz, Front Biosci. 6:D748-759, 2001). The ZGA is one of the first and most critical events in animal development. Earlier reports have established that the ZGA begins during the 1-cell stage (Aoki et al., Dev. Biol. 181:296-307, 1997; Nothias et al., J. Biol. Chem. 270:22077-22080, 1995; Ram and Schultz, Dev. Biol. 156:552-556, 1993). However, global gene expression profiling by DNA microarray analysis has recently revealed that nearly all genes identified for their increase of expression at the 1-cell stage were insensitive to alpha-amanitin treatment, which blocks RNA polymerase II (Hamatani et al., Dev. Cell 6:117-131, 2004; Zeng and Schultz, Dev. Biol. 283:40-57, 2005). Thus, these studies not only identified many ZGA genes, but also revealed that de novo transcription of the zygotic genome begins during the 2-cell stage of mouse preimplantation development (Hamatani et al., Dev. Cell 6:117-131, 2004; Zeng and Schultz, Dev. Biol. 283:40-57, 2005). Furthermore, it has been shown that the major burst of ZGA does not occur at the early 2-cell stage, but during the late 2-cell stage (Hamatani et al., Dev. Cell 6:117-131, 2004).

Arrest of development at the 2-cell stage has been reported for the loss-of-function mutants of Mater/Nalp5 (Tong et al., Nat. Genet. 26:267-268, 2000), Mhr6a/Ube2a (Roest et al., Mol. Cell. Biol. 24:5485-5495, 2004) and Brg1/Smarca4 (Bultman et al., Genes Dev. 20:1744-1754, 2006). Although the timing of the developmental arrest coincides with that of the ZGA, these genes are expressed during oogenesis and stored in oocytes, but are not transcribed in the 2-cell stage. Therefore, these maternal effect genes are not suitable for the study of the ZGA. Previously the ZGA has been studied using either exogenous plasmid-borne reporter genes (Nothias et al., J. Biol. Chem. 270:22077-22080, 1995), or endogenous, but rather ubiquitously expressed genes, such as Hsp70.1 (Christians et al., 1995), eIF-4C (Davis et al., Dev. Biol. 174:190-201, 1996), Xist (Zuccotti et al., Mol. Reprod. Dev. 61:14-20, 2002) and other genes (DePamphilis et al., “Activation of Zygotic Gene Expression” In Advances in Developmental Biology and Biochemistry, Vol. 12, pp. 56-84, Elsevier Science B.V., 2002). Although TEAD-2/TEF-4 (Kaneko et al., Development 124:1963-1973, 1997) and Pou5f1/Oct4 (Palmieri et al., Dev. Biol. 166:259-267, 1994) are considered transcription factors selectively expressed at ZGA (DePamphilis et al., “Activation of Zygotic Gene Expression” In Advances in Developmental Biology and Biochemistry, Vol. 12, pp. 56-84, Elsevier Science B.V., 2002), these genes are known to be expressed in cells other than 2-cell embryos. It is thus important to identify and study individual ZGA genes, especially the genes expressed exclusively at the 2-cell stage.

Global gene expression profiling of preimplantation embryos was previously carried out and a group of genes was identified that showed transient spike-like expression in the 2-cell embryo (Hamatani et al., Dev. Cell 6:117-131, 2004). By examining the expression of these genes in the public expressed sequence tag (EST) database (NCBI/NIH), a novel gene was identified that was represented by only 29 cDNA clones out of 4.7 million mouse ESTs. These cDNA clones have been isolated from cDNA libraries derived from ES cells and preimplantation embryos. Furthermore, the previous DNA microarray data showed that the expression of this gene is detected in ES cells, but not in embryonal carcinoma (EC) cells (F9 and P19), trophoblast stem (TS) cells, or neural stem/progenitor (NS) cells (Aiba et al., Stem Cells 24:889-895, 2006).

One of the cDNA clones derived from ES cells (clone number C0348C03; Sharov et al., PLoS Biol. 1:E74, 2003) was completely sequenced by the Mammalian Gene Collection (MGC) project (Genbank Accession No. BC050218; SEQ ID NO: 11; Gerhard et al., Genome Res. 14:2121-2127, 2004). Whole mount in situ hybridization (WISH) using this cDNA clone as a probe detected a high level of transcripts in late 2-cell embryos (FIG. 1A). The transcript was not detected in unfertilized eggs or in embryos at other preimplantation stages, including 3-cell embryos, suggesting a high specificity of gene expression at the late 2-cell stage and a relatively short half-life of the transcripts. Quantitative reverse-transcriptase PCR (qRT-PCR) analysis confirmed the WISH results (FIG. 1B). Previous microarray analysis showed that the expression of this gene at the late 2-cell stage was suppressed in embryos treated with α-amanitin (a blocker of RNA pol II-based transcription) (Hamatani et al., Dev. Cell 6:117-131, 2004), confirming that this gene is transcribed de novo during the major burst of ZGA. The transient expression pattern was observed in both in vitro cultured embryos and freshly isolated in vivo embryos (Hamatani et al., Dev. Cell 6:117-131, 2004).

Example 3 Structure and Expression of Zscan4 Paralogous Genes

The full-length cDNA sequence (BC050218; SEQ ID NO: 11) of 2292 bp was organized into 4 exons, encoding a protein of 506 amino acids (FIG. 2A). Because this cDNA clone was isolated from a cDNA library made from ES cells (Sharov et al., PLoS Biol. 1:E74, 2003), another cDNA clone was isolated by performing RT-PCR on RNAs isolated from late 2-cell stage embryos and completely sequenced (SEQ ID NO: 21). This 2268 bp cDNA clone encoded a protein of 506 amino acids. DNA and protein sequences clearly showed that these two cDNAs (SEQ ID NOs: 11 and 21) were derived from two different genes with close similarity. Domain prediction analysis revealed a SCAN (Leucine Rich Element) domain and four zinc finger domains at the N- and C-terminal ends, respectively (FIG. 2B). A hypothetical human ortholog, zinc finger and SCAN domain containing 4 (ZSCAN4), was also identified that shares 45% amino acid sequence similarity, with higher conservation in the SCAN (50%) and zinc finger domains (59%) (FIG. 7).

Alignment of full-length cDNA sequences (SEQ ID NOs: 11 and 21) to the mouse genome sequence (mm7) revealed multiple hits in the proximal region of chromosome 7, the syntenic region of human ZSCAN4 (FIG. 8). One notable feature of this genome region was repetitions of a very similar sequence segment. The sequences of each copy of Zscan4 and the surrounding region were very similar to each other, leaving the assembled genome sequences of this region less accurate than those of other regions. To understand the genome structure of this region better, individual BAC clone sequences from this region were manually reassembled into ~850 kb genome sequence contigs (FIG. 3A). Because it was difficult to find a hybridization probe or oligonucleotides to distinguish each copy, restriction enzymes were used that can distinguish small sequence differences among gene copies. Southern blot analysis was carried out by digesting C57BL/6J mouse genomic DNAs with TaqI alone, MspI alone, or TaqI/MspI (FIGS. 3B and C). All the detected DNA fragments confirmed the nine paralogous Zscan4 genes predicted in the assembled genome sequences.

The full-length cDNA sequence (BC050218; SEQ ID NO: 11) was then aligned to the assembled genome sequence and nine gene copies were found, all of which had multi-exon gene organizations (FIG. 2, 3A). Three gene copies were apparently pseudogenes, as no evidence was found that they were transcribed based on available EST information and sequencing analysis of RT-PCR products. Therefore, the genes were named Zscan4-ps1 (SEQ ID NO: 12), Zscan4-ps2 (SEQ ID NO: 13), and Zscan4-ps3 (SEQ ID NO: 14), according to the convention of mouse gene nomenclature. Because the remaining 6 gene copies were transcribed and encoded ORFs, they were named Zscan4a (SEQ ID NO: 15), Zscan4b (SEQ ID NO: 17), Zscan4c (SEQ ID NO: 19), Zscan4d (SEQ ID NO: 21), Zscan4e (SEQ ID NO: 23) and Zscan4f (SEQ ID NO: 25). Three of these genes, Zscan4a, Zscan4b, and Zscan4e, encoded ORFs of 360, 195 and 195 amino acids, respectively, which included the SCAN domain, but not the four zinc finger domains (FIG. 2B).

The remaining three genes, Zscan4c, Zscan4d and Zscan4f, encoded full-length ORFs (506 amino acids). The main features of these genes are summarized in FIG. 3A. Zscan4c corresponds to the cDNA clone isolated from ES cells (C0348C03; Genbank Accession No. BC050218; Gm397; SEQ ID NO: 11). Zscan4d corresponds to the cDNA clone isolated from 2-cell embryos (SEQ ID NO: 21). Zscan4f corresponds to a gene predicted from the genome sequence (Genbank Accession No. XM_145358; SEQ ID NO: 27). Similarities of both ORFs and mRNAs between these three genes were very high (FIG. 7). Thus, it is most likely that these three genes have the same function. To measure the expression levels of each paralog, DNA sequences of the nine Zscan4 paralogs were analyzed by the Clustal X multiple-sequence alignment program, which showed the presence of sequence differences specific to each paralog. To examine the expression levels of each gene in 2-cell embryos and ES cells, cDNA fragments amplified by RT-PCR from 2-cell embryos and ES cells were sequenced. The expression level of each paralog was estimated based on the amplitudes of each nucleotide at polymorphic sites. The results are summarized in FIG. 3A. In 2-cell embryos, Zscan4d was the predominant transcript (90%). In contrast, in ES cells, Zscan4c was the predominant transcript (40%), although Zscan4f was a lesser, but significant transcript (24%). These results were consistent with the origin of each cDNA clone; Zscan4c was derived from the ES cell cDNA library, whereas Zscan4d was derived from the 2-cell embryo library.
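
The idea of estimating paralog expression from nucleotide amplitudes at polymorphic sites can be sketched as follows. This is only an illustration under stated assumptions: each site is assumed to carry one nucleotide that is diagnostic for a single paralog, the peak amplitude of that nucleotide divided by the total amplitude at the site is taken as that paralog's share, and shares are averaged over sites. The function name, the paralog labels and the peak heights are hypothetical and are not the values underlying FIG. 3A.

```python
from statistics import mean


def estimate_paralog_fractions(sites):
    """Estimate per-paralog transcript fractions from sequencing-trace peaks.

    sites: iterable of (paralog, diagnostic_base, peak_heights) tuples, where
    peak_heights maps each base to its trace amplitude at that site.
    """
    per_paralog = {}
    for paralog, base, peaks in sites:
        total = sum(peaks.values())
        if total == 0:
            continue  # no signal at this site, so it carries no information
        per_paralog.setdefault(paralog, []).append(peaks.get(base, 0) / total)
    # Average the site-level estimates for each paralog.
    return {paralog: mean(values) for paralog, values in per_paralog.items()}


# Hypothetical peak heights at three paralog-specific sites.
toy_sites = [
    ("paralogA", "A", {"A": 300, "G": 100}),
    ("paralogA", "T", {"T": 150, "C": 50}),
    ("paralogB", "C", {"C": 50, "T": 150}),
]
fractions = estimate_paralog_fractions(toy_sites)
print(fractions)  # {'paralogA': 0.75, 'paralogB': 0.25}
```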

Example 4 Function of Zscan4 in Preimplantation Development

As a first step to characterize the function of Zscan4 genes, the studies focused on preimplantation development. Initially the possibility of a standard gene targeting strategy was explored, but it was difficult for the following three reasons. First, sequences of Zscan4 paralogs and surrounding genomic regions are too similar to design targeting constructs for specific genes. Second, it is highly likely that a Zscan4d^(−/−) phenotype can be compensated functionally by other Zscan4 paralogs, because in addition to the predominantly expressed Zscan4d, at least 3 other similar copies (Zscan4a, Zscan4e, and Zscan4f) were also transcribed in 2-cell embryos. Third, the presence of other predicted genes, though not annotated as genes yet, within the ~850 kb Zscan4 locus makes a strategy to delete the entire Zscan4 locus less attractive. Therefore, siRNA technology was used. Although RNAi and siRNA technology has been successfully used for blocking the expression of specific genes in preimplantation embryos (Kim et al., Biochem. Biophys. Res. Commun. 296:1372-1377, 2002; Stein et al., Dev. Biol. 286:464-471, 2005), widely-recognized off-target effects are generally a major concern (Jackson et al., RNA 12:1179-1187, 2006; Scacheri et al., Proc. Natl. Acad. Sci. U.S.A. 101:1892-1897, 2004; Semizarov et al., Proc. Natl. Acad. Sci. U.S.A. 100:6347-6352, 2003). To increase confidence in the effects of siRNA against Zscan4, the siRNA experiments were carried out with three independent siRNA technologies: an oligonucleotide-based siRNA (denoted here siZscan4 and obtained from Invitrogen); a vector-based shRNA (denoted here shZscan4 and obtained from Genscript); and a mixture of oligonucleotide siRNAs (denoted here plus-siZscan4 and obtained from Dharmacon) (FIG. 4A, B). Oligonucleotide sequences used for siZscan4, shZscan4, and plus-siZscan4 matched 100% with cDNA sequences of Zscan4a, Zscan4b, Zscan4c, Zscan4d, Zscan4e and Zscan4f, except for shZscan4, which had 2 bp mismatches with Zscan4b and Zscan4e (FIG. 4A, B).
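
The comparison of an siRNA target sequence against each paralog cDNA, as summarized above, amounts to finding the best-matching window and counting mismatches. The following minimal sketch uses the shZscan4 target sequence given in the methods (SEQ ID NO: 9). The flanking bases and the two-mismatch variant are toy sequences made up for illustration and are not the actual Zscan4 cDNA sequences, and the function names are hypothetical.

```python
def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def best_match(target, transcript):
    """Return (offset, mismatches) of the best ungapped match of target in
    transcript, or None if the transcript is shorter than the target."""
    target, transcript = target.lower(), transcript.lower()
    best = None
    for offset in range(len(transcript) - len(target) + 1):
        window = transcript[offset:offset + len(target)]
        mismatches = count_mismatches(target, window)
        if best is None or mismatches < best[1]:
            best = (offset, mismatches)
    return best


SH_ZSCAN4 = "gagtgaattgctttgtgtc"  # shZscan4 target sequence, SEQ ID NO: 9

# Toy transcripts (hypothetical): one perfect site, one site with 2 substitutions.
perfect = "ttt" + SH_ZSCAN4 + "aaa"
two_mismatch = "ccc" + "gagtcaattgatttgtgtc" + "ggg"

print(best_match(SH_ZSCAN4, perfect))       # (3, 0)
print(best_match(SH_ZSCAN4, two_mismatch))  # (3, 2)
```

An ungapped scan is sufficient here because the comparison described concerns base substitutions within a short target, not insertions or deletions.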

A shZscan4 vector was microinjected into the male pronucleus of zygotes at 21-23 hours after the hCG injection and embryos were observed during preimplantation development (FIGS. 4C and D). At 61 hours post-hCG, when the majority (58.8%) of shControl-injected embryos had already reached the 4-cell stage, the majority (78.8%) of shZscan4-injected embryos remained at the 2-cell stage. By 98 hours post-hCG, when the majority (70.0%) of shControl-injected embryos had reached the blastocyst stage, the majority (52.5%) of shZscan4-injected embryos had reached only the morula stage. A significant reduction (−95%) of Zscan4 RNA levels was confirmed by the qRT-PCR analysis (FIG. 4E). Taken together, these results indicate that the development of shZscan4-injected embryos was delayed for about 24 hrs between the 2- and 4-cell stages, followed by progression to the later stages at a speed comparable to that of shControl-injected embryos. Essentially the same results were obtained using two different siRNA technologies: siZscan4 (FIG. 9) and plus-siZscan4 (FIG. 10).

siZscan4-injected embryos formed normal-looking early blastocysts (3.5 d.p.c.), but often failed to form expanded blastocysts (4.5 d.p.c.; 45% of siZscan4-injected embryos versus 6% of siControl-injected embryos; FIG. 9B). To test whether these blastocysts were compromised even at 3.5 d.p.c., shZscan4-injected blastocysts were transferred to the uterus of pseudo-pregnant mice. None of the shZscan4-injected blastocysts implanted, whereas most shControl-injected embryos implanted (Table 1). In vitro blastocyst outgrowth experiments determined that cells of shZscan4-injected blastocysts failed to proliferate in culture (Table 1). These results clearly demonstrated that the transient expression of Zscan4 at the late 2-cell stage is required for the development of proper blastocysts.

TABLE 1 Blastocyst outgrowth (A) and post-implantation development (B) of embryos that received pronuclear injection of shZscan4 or shControl

A Blastocyst Outgrowth    Number of tested blastocysts    Number of successful outgrowth
shZscan4                  16                              0
shControl                 17                              7

B Embryo Transfer         Number of blastocysts transferred to pseudo-pregnant mother    Number of pups born
shZscan4                   8                                                             0
shControl                 10                                                             4

*A shZscan4 or shControl vector was microinjected into the male pronucleus of zygotes at 21-23 hours after the hCG injection. Early blastocysts (3.5 d.p.c.) formed from these embryos were subjected to tests of blastocyst outgrowth (A) and embryo transfer (B). In the outgrowth assay, the presence of proliferating cells after 6 days in culture was considered as successful outgrowth.

The notion that the reduction of Zscan4 expression level delays the development of preimplantation embryos at the 2-cell stage was further supported by the fact that when shZscan4 was injected into one of the blastomeres of early 2-cell stage embryos, ~28% of embryos became 3-cell embryos (FIG. 5A). The blastomere that received the shZscan4 injection remained as a 2-cell blastomere, whereas the other blastomere cleaved into two smaller blastomeres with the size of 4-cell blastomeres (FIG. 5D). Subsequently, these embryos (24%) became unevenly cleaved embryos, typically 5-cell embryos, with one 2-cell-sized blastomere and four 8-cell-sized blastomeres (FIG. 5B, E). These embryos eventually formed blastocyst-like structures, but they seemed to be mixtures of a blastocyst-like cell mass and a morula-like cell mass, which was often GFP-positive, a marker for the shRNA-injected blastomere (FIG. 5C, F, G). In contrast, when shControl was injected into one of the blastomeres at the early 2-cell stage, nearly all embryos cleaved normally (FIG. 5A, B, C).

To investigate the effect of prolonged Zscan4d expression on preimplantation development, Zscan4d was overexpressed by microinjecting a Zscan4d-expressing plasmid into the male pronucleus of zygotes. Although the Zscan4d plasmid-injected embryos showed a rate of development similar to that of control plasmid-injected embryos, the former blastocysts failed to produce outgrowth (Table 2A) and failed to implant (Table 2B). The results suggest that the timely downregulation of Zscan4d is also important for the proper development of blastocysts.

TABLE 2 Blastocyst outgrowth (A) and post-implantation development (B) of embryos that received pronuclear injection of a Zscan4d-expressing plasmid or a control plasmid

A Blastocyst Outgrowth        Number of tested blastocysts    Number of successful outgrowth
Zscan4d-expressing plasmid    10                               2
Control plasmid               15                              11

B Embryo Transfer             Number of blastocysts transferred to pseudo-pregnant mother    Number of pups
Zscan4d-expressing plasmid    10                                                             0
Control plasmid               14                                                             5

*A plasmid vector constitutively expressing Zscan4d gene or control empty vector was microinjected into the male pronucleus of zygotes at 21-23 hours after the hCG injection. Early blastocysts (3.5 d.p.c.) formed from these embryos were subjected to the same tests as described in Table 1.

Example 5 Analysis of Zscan4 Expression Using the Whole Mount In Situ Hybridization (WISH)

One intriguing aspect of the expression pattern of Zscan4 is the exclusive expression in late 2-cell embryos and ES cells. This appears to be counter-intuitive, because ES cells are derived from the ICM and many genes that are expressed in ES cells are also expressed in the ICM (e.g., Yoshikawa et al., Gene Expr. Patterns 6:213-224, 2006). Therefore the expression of Zscan4 in blastocysts, blastocyst outgrowth, and ES cells was examined using WISH. The results demonstrated that the expression of Zscan4 was not detected anywhere in blastocysts, including the ICM, or in the early blastocyst outgrowth (FIG. 6A). However, the expression of Zscan4 began to be detected in a small fraction of cells by day 6 of the outgrowth. Surprisingly, strong expression of Zscan4 was detected in only a small fraction of ES cells in undifferentiated colonies. In contrast, the expression of Pou5f1 (Oct3/4), a well-known marker for pluripotency, was detected in the ICM of blastocysts, a large fraction of the cells in the blastocyst outgrowth, and the majority of ES cells in undifferentiated colonies (FIG. 6A). Due to the close similarity of cDNA sequences, each Zscan4 paralog could not be distinguished by WISH, but the expression analysis by sequencing RT-PCR products mentioned above indicates that Zscan4c and Zscan4f were the genes detected in the subpopulation of the cells in blastocyst outgrowth and ES cells by WISH.

Example 6 Zscan4 Promoter Expression Vector

As described in previous Examples herein, Zscan4 expression is only detected in a subpopulation of undifferentiated ES cells. In order to identify this subpopulation of ES cells, and to identify any other cell expressing Zscan4, an expression plasmid was developed which comprises a Zscan4c promoter sequence and the Emerald reporter gene (a variant of green fluorescent protein). The components and orientation of the expression vector are illustrated in FIG. 11. The sequence of the Zscan4c promoter-Emerald expression vector is set forth as SEQ ID NO: 28. The nucleotide ranges of SEQ ID NO: 28 of the components of the expression vector are provided in Table 3.

TABLE 3 Zscan4c Promoter-Emerald Expression Vector

Component                     Nucleotides of SEQ ID NO: 28
Zscan4c promoter              1-3347
TATA box                      2483-2489
Zscan4c exon 1                2541-2643
Zscan4c intron 1              2644-3250
Zscan4c exon 2 (partial)      3251-3347
Emerald start codon           3398-3400
Emerald reporter gene         3398-4117
TK poly A signal              4132-4403
EM7 promoter                  5257-5323
Blasticidin selection gene    5330-5722
SV40 polyA signal             5880-6010
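
The coordinates in Table 3 can be handled as a simple feature map, for example to compute component lengths or to slice a component out of the vector sequence. The sketch below is illustrative only: the coordinates are taken from Table 3 and treated as 1-based inclusive positions, while the helper names are hypothetical and the sequence of SEQ ID NO: 28 itself is not reproduced here.

```python
# 1-based, inclusive coordinates from Table 3 (positions in SEQ ID NO: 28).
FEATURES = {
    "Zscan4c promoter": (1, 3347),
    "TATA box": (2483, 2489),
    "Zscan4c exon 1": (2541, 2643),
    "Zscan4c intron 1": (2644, 3250),
    "Zscan4c exon 2 (partial)": (3251, 3347),
    "Emerald start codon": (3398, 3400),
    "Emerald reporter gene": (3398, 4117),
    "TK poly A signal": (4132, 4403),
    "EM7 promoter": (5257, 5323),
    "Blasticidin selection gene": (5330, 5722),
    "SV40 polyA signal": (5880, 6010),
}


def feature_length(name):
    """Length in nucleotides of a named component."""
    start, end = FEATURES[name]
    return end - start + 1


def extract(sequence, name):
    """Slice a component out of a vector sequence given as a Python string."""
    start, end = FEATURES[name]
    return sequence[start - 1:end]  # convert 1-based inclusive to 0-based slice


def are_contiguous(*names):
    """True if each named component starts right after the previous one ends."""
    return all(FEATURES[a][1] + 1 == FEATURES[b][0] for a, b in zip(names, names[1:]))


print(feature_length("Emerald reporter gene"))  # 720
print(are_contiguous("Zscan4c exon 1", "Zscan4c intron 1", "Zscan4c exon 2 (partial)"))  # True
```

The Emerald reporter range spans 720 nucleotides, a multiple of three, which is consistent with an uninterrupted open reading frame beginning at the listed start codon.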

Mouse ES cells were transfected with the Zscan4c promoter expression vector and analyzed by fluorescence activated cell sorting (FACS) to identify Emerald-positive cells and Emerald-negative cells. A cell that expresses Zscan4 is Emerald-positive. The results show that approximately 3-5% of mouse ES cells express Zscan4 (FIG. 12).

Sorted cells were collected and analyzed by quantitative real time PCR (qPCR) for expression of Zscan4c and Pou5f1 (also known as Oct3, Oct4, Oct3/4), a well-known marker for pluripotency. As shown in FIG. 12, Pou5f1 is expressed at the same level in both Emerald-positive and Emerald-negative cells, whereas Zscan4c is more highly expressed in Emerald-positive cells than in Emerald-negative cells. The data indicate that the Zscan4c promoter sequence used in this vector can reproduce the expression of the endogenous Zscan4c gene, and thus the Zscan4c promoter-Emerald expression vector can be used to purify Zscan4-expressing cells. The data also indicate that both Zscan4-expressing cells and non-expressing cells retain expression of the pluripotency marker Pou5f1; thus this subpopulation of ES cells cannot be identified by a standard pluripotency marker.

Example 7 Mouse ES Cell Line Expressing Emerald Under Control of the Zscan4 Promoter

A mouse ES cell line was established in which the Zscan4c promoter expression vector described in Example 6 was stably incorporated into the cells. The ES cell line expresses Emerald under control of the Zscan4c promoter. After transfecting a linearized plasmid DNA into mouse ES cells, the cells were cultured in the presence of the selection agent (blasticidin). The blasticidin-resistant ES cell clones were isolated and used for further analysis.

As described herein, Zscan4 is only expressed in a subpopulation of undifferentiated ES cells (approximately 3-5% of ES cells). Accordingly, the ES cell line incorporating the Zscan4 promoter expression vector exhibits expression in only a small percentage, approximately three percent, of cells.

Example 8 Identification of Nine Genes Co-Expressed with Zscan4 in a Sub-Population of ES Cells

Using the mouse ES cell line stably transfected with the Zscan4c promoter (as described in Example 7), DNA microarray analysis was performed to compare gene expression patterns of Emerald(+) and Emerald(−) cells. Emerald(+) and Emerald(−) cells were sorted by FACS and total RNAs were isolated from each cell population. These RNAs were labeled and hybridized to the NIA-Agilent 44K DNA microarray (Agilent Technologies).

Nine genes were identified as being co-expressed with Zscan4: AF067063, Tcstv1/Tcstv3, Tho4, Arginase II, BC061212 and Gm428, Eif1a, EG668777 and Pif1. In situ hybridization was performed to confirm expression of these genes in mouse ES cells. The 2-cell embryo-specific expression profiles of six of these genes (AF067063, Tcstv3, Tho4, Arginase II, BC061212 and Gm428) are shown in FIGS. 13A-G.

Example 9 Trim43 is Specifically Expressed in 4-Cell to Morula Stage Embryos

To identify genes that are specifically expressed at the 8-cell and morula stages, publicly available EST frequency data (TIGR Mouse Gene Index; MGI Library Expression Search; NIA Mouse Gene Index (Sharov et al., PLoS Biol. 1:E74, 2003)) and microarray data from mouse preimplantation embryos (Hamatani et al., Dev. Cell 6(1):117-31, 2004) were used. After selecting candidate genes, quantitative RT-PCR analysis was carried out to confirm the specific expression pattern of Trim43 (tripartite motif-containing protein 43).

Trim43 expression was detected beginning at the 4-cell embryonic stage and peaked at the morula stage. A low level of Trim43 expression was detected in blastocysts. The function of the Trim43 protein is unknown. The nucleotide and amino acid sequences of Trim43 are provided herein as SEQ ID NO: 32 and SEQ ID NO: 33, respectively. The nucleic acid sequence of the Trim43 promoter is provided herein as SEQ ID NO: 31.

Example 10 Transgenic “Rainbow” Mouse

As described herein, an expression vector comprising a Zscan4c promoter operably linked to a first heterologous polypeptide (Emerald) and an expression vector comprising a Trim43 promoter operably linked to a second heterologous polypeptide (Strawberry) have been generated. A transgenic mouse (a “rainbow” mouse) can be generated which incorporates both of these expression constructs.

A 7155 base pair DNA fragment containing the Insulator-Zscan4 promoter-Emerald and TK polyA and an 8672 base pair DNA fragment containing the Insulator-Trim43 promoter-Strawberry are co-injected into the pronucleus of fertilized mouse eggs (B6C3 X B6).

Embryos obtained from the rainbow mouse will exhibit green color (as a result of expression of Emerald) at the late 2-cell stage, and red color (due to expression of Strawberry) from the 4-cell stage to the morula stage (with peak expression at the morula stage). The expression of Emerald and Strawberry at the appropriate stage of embryonic development indicates proper development of the embryo. Thus, these embryos will be useful for a number of research and clinical purposes. For example, embryos obtained from the rainbow mouse can be used to develop optimized culture conditions for embryos, which can be applied to human embryos used in the IVF clinic. In addition, these embryos can be used to test chemical compounds or drugs for toxicity to the embryo. The embryos can also be used as indicators of successful nuclear reprogramming for nuclear transplantation procedures.

This disclosure provides methods of inhibiting differentiation of stem cells and promoting blastocyst outgrowth of ES cells. The disclosure further provides a Zscan4 promoter sequence and methods of use, including identification of a subpopulation of stem cells expressing Zscan4. It will be apparent that the precise details of the methods described may be varied or modified without departing from the spirit of the described invention. We claim all such modifications and variations that fall within the scope and spirit of the claims below.

1. A method of inhibiting differentiation of a stem cell, comprising increasing the expression of Zscan4 in the stem cell, thereby inhibiting differentiation of the stem cell.
2. The method of claim 1, wherein increasing the expression of Zscan4 comprises increasing the expression of Zscan4a, Zscan4b, Zscan4c, Zscan4d, Zscan4e, Zscan4f, human ZSCAN4 or a combination thereof.
3. The method of claim 1, wherein increasing the expression of Zscan4 in the cell comprises transfecting the stem cells with a nucleic acid encoding Zscan4 operably linked to a promoter.
4. The method of claim 3, wherein the nucleic acid encoding Zscan4 comprises SEQ ID NO: 60.
5. The method of claim 3, wherein the nucleic acid encoding Zscan4 is at least 95% identical to Zscan4c (SEQ ID NO: 19), Zscan4d (SEQ ID NO: 21) or Zscan4f (SEQ ID NO: 25).
6-7. (canceled)
8. The method of claim 3, comprising transfecting the cells with a vector comprising the nucleic acid encoding Zscan4.
9. (canceled)
10. The method of claim 1, wherein inhibiting the differentiation of the stem cell increases the viability of the stem cell.
11. The method of claim 1, wherein inhibiting the differentiation of the stem cell prevents senescence of the stem cell.
12. The method of claim 1, wherein the stem cell is an embryonic stem cell, an embryonic germ cell, a germline stem cell or a multipotent adult progenitor cell.
13. A method of promoting blastocyst outgrowth of embryonic stem cells, comprising increasing the expression of Zscan4 in the embryonic stem cells, thereby promoting blastocyst outgrowth of the embryonic stem cells.
14. The method of claim 13, wherein increasing the expression of Zscan4 comprises increasing the expression of Zscan4a, Zscan4b, Zscan4c, Zscan4d, Zscan4e, Zscan4f, human ZSCAN4 or a combination thereof.
15. The method of claim 13, wherein increasing the expression of Zscan4 in the cell comprises transfecting the stem cells with a nucleic acid encoding Zscan4 operably linked to a promoter.
16. The method of claim 15, wherein the nucleic acid encoding Zscan4 comprises SEQ ID NO: 6064.
17. The method of claim 15, wherein the nucleic acid encoding Zscan4 is at least 95% identical to Zscan4c (SEQ ID NO: 19), Zscan4d (SEQ ID NO: 21) or Zscan4f (SEQ ID NO: 25).
18-19. (canceled)
20. The method of claim 15, comprising transfecting the cells with a vector comprising the nucleic acid encoding Zscan4.
21. (canceled)
22. A method of identifying a subpopulation of stem cells expressing Zscan4, comprising transfecting the cells with an expression vector comprising a Zscan4 promoter and a reporter gene, wherein expression of the reporter gene indicates Zscan4 is expressed in the subpopulation of stem cells.
23. The method of claim 22, wherein the Zscan4 promoter is a Zscan4c promoter.
24. The method of claim 23, wherein the Zscan4c promoter comprises the nucleic acid sequence set forth as: nucleotides 1-2540 of SEQ ID NO: 28, nucleotides 1-2643 of SEQ ID NO: 28, nucleotides 1-3250 of SEQ ID NO: 28, nucleotides 1-3347 of SEQ ID NO: 28, or SEQ ID NO: 28.
25-28. (canceled)
29. An isolated expression vector comprising a Zscan4c promoter operably linked to a nucleic acid encoding a heterologous polypeptide.
30. The isolated expression vector of claim 29, wherein the Zscan4c promoter comprises the nucleic acid sequence set forth as nucleotides 1-2540 of SEQ ID NO: 28, nucleotides 1-2643 of SEQ ID NO: 28, nucleotides 1-3250 of SEQ ID NO: 28, or nucleotides 1-3347 of SEQ ID NO: 28.
31-33. (canceled)
34. The isolated expression vector of claim 29, wherein the polypeptide is a marker, enzyme, or fluorescent protein.
35-36. (canceled)
37. An isolated embryonic stem cell comprising the expression vector of claim 29.
38. An isolated expression vector comprising a Trim43 promoter operably linked to a nucleic acid encoding a heterologous polypeptide.
39. The isolated expression vector of claim 38, wherein the Trim43 promoter comprises the nucleic acid sequence set forth as SEQ ID NO: 31.
40. The isolated expression vector of claim 38, wherein the polypeptide is a marker, enzyme, or fluorescent protein.
41-42. (canceled)
43. An isolated embryonic stem cell comprising the expression vector of claim 38.
44. A method of identifying a subpopulation of stem cells, wherein the stem cells express Zscan4, comprising detecting expression of one or more of AF067063, Tcstv1/Tcstv3, Tho4, Arginase II, BC061212 and Gm428, Eif1a, EG668777 and Pif1.
45. An isolated stem cell identified according to the method of claim 40.